python – Creating batches of sequences for pytorch LSTM

I'm currently working on an LSTM autoencoder in PyTorch. I have a large number of samples, and each sample contains 120 features. For now I'm creating sequences of length 1 with a batch_size of 1, and everything works fine. I first convert my data array to a list and then turn it into sequences of length 1 with the following function:

def dataset(mydatalist):
    # one tensor per sample; unsqueeze(1) adds a trailing dimension
    dataset = [torch.tensor(s).unsqueeze(1) for s in mydatalist]

    # stack only to read off the overall shape
    n_seq, seq_len, n_features = torch.stack(dataset).shape
    return dataset, seq_len, n_features
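
Just to show what this produces, here is a quick sanity check on dummy data (the 8 fake samples of 120 values each are made up for illustration; only the dataset function above is my real code):

    import torch

    dummy_list = [[float(i) for i in range(120)] for _ in range(8)]  # 8 fake samples
    ds, seq_len, n_features = dataset(dummy_list)
    print(len(ds), ds[0].shape, seq_len, n_features)  # per-sample tensor shape and the reported dims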

For training, I use the following procedure:

    model.train()
    optimizer = args.optim
    criterion = args.loss

    history = dict(train=[], val=[])
    for epoch in range(1, args.epoch + 1):
        ts = time.time()
        model = model.train()
        train_losses = []

        # training pass, one sequence at a time (batch_size = 1)
        for seq_true in train_dataset:
            optimizer.zero_grad()

            seq_true = seq_true.to(args.device)
            seq_pred = model(seq_true, args)

            loss = criterion(seq_pred, seq_true)

            loss.backward()
            optimizer.step()

            train_losses.append(loss.item())

        # validation pass
        val_losses = []
        model = model.eval()
        with torch.no_grad():
            for seq_true in val_dataset:
                seq_true = seq_true.to(args.device)
                seq_pred = model(seq_true, args)

                loss = criterion(seq_pred, seq_true)
                val_losses.append(loss.item())

        te = time.time()
        train_loss = np.mean(train_losses)
        val_loss = np.mean(val_losses)
        history['train'].append(train_loss)
        history['val'].append(val_loss)

        print(f"Epoch: {epoch} of {args.epoch},  train loss: {train_loss},  val loss: {val_loss}, time: {te-ts}s.")

Everything works fine and I am able to train my LSTM autoencoder, but because I have a large number of training samples, the procedure is too slow. I had two ideas for speeding it up. The first was to increase the sequence length; to do that, I changed my dataset function to the following:

def dataset(sequences, seq_len):
  dataset = []
  # group consecutive samples into chunks of seq_len
  for i in range(0, len(sequences), seq_len):
    dataset.append(torch.tensor(sequences[i:i + seq_len]))
  n_seq, seq_len, n_features = torch.stack(dataset).shape
  return np.array(dataset), seq_len, n_features
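
To check that this function itself runs, I tried it on random dummy data (random numbers only; note that torch.stack needs every chunk to have the same length, so I made the number of samples divisible by seq_len):

    import numpy as np
    import torch

    dummy = np.random.randn(100, 120).astype(np.float32)  # 100 fake samples, 120 features
    ds, seq_len, n_features = dataset(dummy, 10)
    print(ds.shape, seq_len, n_features)  # (10, 10, 120) 10 120 on this dummy data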

My second idea was to increase the batch_size. As you can see in my code, "for seq_true in train_dataset" implies batch_size = 1, so I tried to feed several sequences to the network at once. Both approaches failed with shape-related errors inside my neural network, like this one:

RuntimeError: shape 'x' is invalid for input of size y

Since my neural network code supports batching, this error suggests that the problem is in how I create sequences or batches longer than one. Unfortunately I was not able to fix it myself, no matter how hard I tried. Could you please point out how to achieve these two goals: first, creating sequences of length greater than 1, and then batching those sequences with a batch_size greater than 1, so that I can speed up my training procedure? Below is a sketch of what I think the end result should look like.
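
For clarity, this is roughly the pipeline I am hoping to end up with. It is only a sketch of my understanding: the reshape, the use of torch.utils.data.TensorDataset / DataLoader, and the assumption that the LSTM is built with batch_first=True are all guesses on my part, not code that currently works for me:

    import numpy as np
    import torch
    from torch.utils.data import TensorDataset, DataLoader

    # Assumption: samples is a (num_samples, 120) float array and num_samples
    # is divisible by seq_len, so it can be reshaped without padding.
    samples = np.random.randn(1000, 120).astype(np.float32)
    seq_len, batch_size = 10, 32

    # (num_samples, 120) -> (num_sequences, seq_len, 120)
    sequences = torch.from_numpy(samples).reshape(-1, seq_len, samples.shape[1])

    loader = DataLoader(TensorDataset(sequences), batch_size=batch_size, shuffle=True)

    for (batch,) in loader:
        # batch is (up to batch_size, seq_len, 120); with nn.LSTM(..., batch_first=True)
        # I assume this could replace seq_true in my training loop above
        print(batch.shape)
        break

I am not sure whether a DataLoader is the right tool here or whether I should keep the manual chunking from my dataset function, so any guidance on that is welcome too.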
