
MATLAB Vectorization Demystified – Part 1: Vectors and Indexing

“MATLAB Vectorization Demystified”: In this series, I explain vectorized operations in MATLAB in detail, step by step, through several examples. This is part 1, which covers vectors and indexing.

Vector Definition in MATLAB

In MATLAB, a vector is a matrix with either one row or one column.

sampleVector1 = [1 2 3 4 5]
sampleVector2 = [1 ; 2 ; 3 ; 4 ; 5]

sampleVector1 =
      1    2    3    4    5
sampleVector2 =
      1
      2
      3
      4
      5

sampleVector1 is called a row vector and sampleVector2 is called a column vector in MATLAB. Evidently, a row vector always has exactly one row and multiple columns, while a column vector has one column and multiple rows.

To turn any vector into a column vector, we can apply the colon index (:) to it. The result is always a column vector, whatever the orientation of the input.

sampleVector1(:)

ans =
      1
      2
      3
      4
      5

sampleVector2(:)

ans =
      1
      2
      3
      4
      5

To convert a row vector to a column vector or vice versa, we can transpose it:

sampleVector1'     % transposing the variable

ans =
      1
      2
      3
      4
      5

sampleVector2'     % transposing the variable

ans =
      1    2    3    4    5
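The difference between (:) and the transpose can be sketched as follows. The variable names rowVec, colVec, and z are illustrative and not part of the examples above. The sketch also shows that ' is the complex conjugate transpose, while .' is the plain transpose; the two only differ for complex data.

```matlab
rowVec = [1 2 3];        % illustrative row vector
colVec = [1; 2; 3];      % illustrative column vector

size(rowVec(:))          % 3 1: (:) turns a row into a column
size(colVec(:))          % 3 1: a column stays a column
size(rowVec')            % 3 1: transpose flips the orientation
size(colVec')            % 1 3: so a column becomes a row

z = [1+2i, 3-4i];        % complex row vector
z'                       % conjugate transpose: [1-2i; 3+4i]
z.'                      % plain transpose:     [1+2i; 3+4i]
```

In short, (:) is the safer choice when a function must receive a column vector regardless of what the caller passed in.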

To check whether a given variable is a vector, we can use the isvector() function, which returns 1 (true) when the variable is indeed a vector:

isvector(sampleVector1)

ans =
      logical
      1

isvector(sampleVector2)

ans =
      logical
      1
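When the orientation matters, the related functions isrow() and iscolumn() can be used alongside isvector(). A minimal sketch, using an illustrative vector v that is not part of the examples above:

```matlab
v = [1 2 3 4 5];       % illustrative row vector

isrow(v)               % 1: v has exactly one row
iscolumn(v)            % 0: v is not a column
iscolumn(v(:))         % 1: (:) always yields a column
isvector(magic(3))     % 0: a 3-by-3 matrix is not a vector
isvector(7)            % 1: a scalar is 1-by-1, so it counts as a vector
```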

Various types of vectors

A vector can contain different data types including:

  • Numeric values (integers, floating-point numbers, and so on),
  • Logical values,
  • Characters and strings,
  • Date and time, etc.

If you are not familiar with the basic data types in MATLAB, please check out the online MATLAB documentation. Vectors of any of these types can be either row or column vectors. Let’s try some numeric vectors in this section.

sampleVector3 = [1 2 3 4 5]

sampleVector3 =
      1    2    3    4    5

sampleVector4 = [4.5 , 1e-4 , 2.366 , 5 , -0.9]

sampleVector4 =
      4.5000    0.0001    2.3660    5.0000    -0.9000

sampleVector4 was written with decimal, scientific, and integer notation. However, every numeric literal in MATLAB is stored as a double by default, regardless of how it is typed, so all five elements are doubles and are displayed in the same format.
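The class() function makes the stored type visible. The sketch below uses illustrative variables a and b. It also shows a case that often surprises newcomers: when an explicitly typed integer such as int8(5) is concatenated with a double, the result takes the integer class and the double is rounded.

```matlab
a = [4.5, 1e-4, 5];      % mixed notations, all stored as double
class(a)                 % 'double'
class([1 2 3])           % 'double': integer-looking literals are doubles too

b = [int8(5), 2.7];      % explicit integer type mixed with a double
class(b)                 % 'int8': the integer class wins
b                        % 5 3: 2.7 was rounded to 3
```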

A sample character vector is shown below:

sampleVector5 = ['a' , 'b ' , 'c' , 'd' , 'e']

sampleVector5 =
      'ab cde'

All the characters are concatenated into a single character vector, 'ab cde'. Although each character can still be retrieved by indexing (explained later in this post), the result is displayed as a single word. Note also that the white space in 'b ' is preserved, which is useful when spacing must be kept during concatenation.
To keep the entries separate, we need to use a string vector (array):

sampleVector6 = ["a" , "b" , "c" , "d" , "e"]

sampleVector6 =
      1×5 string array
      "a"    "b"    "c"    "d"    "e"

If we use a column vector instead of a row vector for sampleVector5, we will get an error because of the second entry 'b ':

sampleVector5 = ['a' ; 'b ' ; 'c' ; 'd' ; 'e']

Dimensions of arrays being concatenated are not consistent.

The reason is that elements 1, 3, 4, and 5 have only one character. However, element 2 has two characters, which is not consistent with the rest of the elements.

sampleVector5 = ['a' ; 'b' ; 'c' ; 'd' ; 'e']

sampleVector5 =
      5×1 char array
      'a'
      'b'
      'c'
      'd'
      'e'

Therefore, we will have a character array in this case where each element occupies a single row.
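The practical difference between character vectors and string arrays shows up when counting and indexing elements. A small sketch, with illustrative variables charVec and strVec:

```matlab
charVec = ['a', 'b ', 'c'];    % concatenates into the char vector 'ab c'
strVec  = ["a", "b ", "c"];    % stays a 1-by-3 string array

numel(charVec)                 % 4: one element per character
numel(strVec)                  % 3: one element per string
charVec(2)                     % 'b': a single character
strVec(2)                      % "b ": the whole second string
strlength(strVec)              % 1 2 1: characters in each string
```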
An example of a logical vector is as follows:

sampleVectorLogic1 = [true false true true false]

sampleVectorLogic1 =
      1×5 logical array
      1    0    1    1    0

true and false are built-in MATLAB functions that return logical 1 (TRUE) and logical 0 (FALSE). It is also possible to convert numeric values to a logical array using the logical() function, where every nonzero value becomes true:

sampleVectorLogic2 = logical([0 -1 0.5 0 3])

sampleVectorLogic2 =
      1×5 logical array
      0    1    1    0    1

Single element indexing

It is the simplest form of indexing when you want to return only a single element from a vector. In this case, we simply can use the index number (i.e., subscript) to return the element.

sampleVectorS = [-1 , 2.2 , 32 , 1/4 , -3*pi]

sampleVectorS =
      -1.0000    2.2000    32.0000    0.2500    -9.4248

sampleVectorS(5)

ans =
      -9.4248

To retrieve the last element of a vector, we can use end, a reserved keyword that, inside an index expression, stands for the last index of the vector.

sampleVectorS(end)

ans =
      -9.4248

Here, end is equivalent to the value returned by the length() function:

sampleVectorS(length(sampleVectorS))

ans =
      -9.4248
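Because end behaves like a number inside an index expression, it can also take part in arithmetic. A brief sketch, using an illustrative vector v with the same values as sampleVectorS:

```matlab
v = [-1, 2.2, 32, 1/4, -3*pi];   % same values as sampleVectorS

v(end-1)          % 0.2500: second-to-last element
v(numel(v))       % -9.4248: numel() gives the same index as end here
v(end+1) = 7;     % assigning one past the end grows the vector
numel(v)          % 6
```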

Multiple elements indexing using another vector

In most cases, however, we need to return more than one element from a vector by indexing. In that case, we have multiple subscripts (or indices) based on the problem at hand.
If we want to retrieve specific elements of a vector without any pattern, then the best way is to create a vector of the requested indices.

sampleVectorM = [-1 , 2.2 , 32 , 1/4 , -3*pi , ...
                 2.9 , 5 , 12 , 8.6/2 , 2^1.6]

sampleVectorM =
      -1.0000    2.2000    32.0000    0.2500    -9.4248    2.9000    5.0000    12.0000    4.3000    3.0314

% we want to return the elements at indices 1, 4, 5, 7, and 10
sampleVectorM([1 , 4 , 5 , 7 , 10])

ans =
      -1.0000    0.2500    -9.4248    5.0000    3.0314

Can we use the end keyword instead of 10?

sampleVectorM([1 , 4 , 5 , 7 , end])

ans =
      -1.0000    0.2500    -9.4248    5.0000    3.0314

The answer is YES, as MATLAB recognizes the keyword within the index vector.

Sometimes, the indices of interest have a pattern, which we can use for indexing. Let’s say we want to return only EVEN indices from sampleVectorM vector:

sampleVectorM(2:2:end)

ans =
      2.2000    0.2500    2.9000    12.0000    3.0314

sampleVectorM([2:2:end])

ans =
      2.2000    0.2500    2.9000    12.0000    3.0314

The last command shows that wrapping the index in brackets [] is allowed but not needed, since the expression inside the brackets already returns a vector. Be mindful that end has this meaning only inside an index expression, so it does not work outside of indexing notation.

sampleVector = 2:2:end

sampleVector = 2:2:end
                   ↑
Error: Illegal use of reserved keyword "end".

What if we need elements 2 to 4 and 7 to 10?

sampleVectorM([2:4 , 7:end])

ans =
      2.2000    32.0000    0.2500    5.0000    12.0000    4.3000    3.0314
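An index vector does not have to be sorted or unique; elements come back in the order requested. A short sketch with an illustrative vector v:

```matlab
v = [10 20 30 40 50];   % illustrative vector

v([5 1 3])              % 50 10 30: result follows the index order
v([2 2 2])              % 20 20 20: indices may repeat
v(end:-1:1)             % 50 40 30 20 10: reverses the vector
v(1:2:end)              % 10 30 50: elements at odd positions
```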

Logical indexing

Sometimes we don’t know the indices of interest in advance, because they depend on the values in the vector itself. Assume that we are only interested in the negative values in sampleVectorM. In this case, we need to use logical indexing.
Logical indexing starts from a logical vector of the same size as the original vector, in which 1 and 0 indicate whether each element satisfies the condition:

sampleVectorM < 0

ans =
      1×10 logical array
      1    0    0    0    1    0    0    0    0    0

where each 1 (or true) marks an element of sampleVectorM that is negative.
We can use this logical vector to return the negative elements of sampleVectorM in a single line:

sampleVectorM(sampleVectorM < 0)

ans =
      -1.0000    -9.4248

If we are interested in the positions of the elements of sampleVectorM that satisfy the condition, we can use the find() function in MATLAB:

find(sampleVectorM < 0)

ans =
      1    5

NOTE: When you only need the values, logical indexing is generally faster than going through find. The reason is that find does extra work to compute the numeric positions, and that work is wasted when the positions are not needed.
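Both routes select the same elements, which can be checked directly. In the sketch below, v and mask are illustrative names; nnz() and any() are shown as cheap ways to summarise a mask without calling find().

```matlab
v = [-1, 2.2, 32, 1/4, -3*pi];   % illustrative vector
mask = v < 0;                    % logical mask of the negative elements

isequal(v(mask), v(find(mask)))  % 1: both forms return the same elements
nnz(mask)                        % 2: how many elements satisfy the condition
any(mask)                        % 1: at least one element satisfies it

v(mask) = 0;                     % a mask also works on the left-hand side
v                                % 0 2.2000 32.0000 0.2500 0
```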

We can also limit how many positions find() returns, counting from the beginning or the end of the vector:

% to return the indices of the first 5 positive elements in sampleVectorM
find(sampleVectorM > 0 , 5 , 'first')

ans =
      2    3    4    6    7

% to return the indices of the last 4 positive elements in sampleVectorM
find(sampleVectorM > 0 , 4 , 'last')

ans =
      7    8    9    10

To return the actual numbers (not the subscripts/indices), we can use:

sampleVectorM(find(sampleVectorM > 0 , 4 , 'last'))

ans =
      5.0000    12.0000    4.3000    3.0314

MATLAB allows us to combine multiple conditions in logical indexing. Assume that we want to return the elements larger than 4 or smaller than -4 in sampleVectorM:

sampleVectorM(sampleVectorM > 4 | sampleVectorM < -4)

ans =
      32.0000    -9.4248    5.0000    12.0000    4.3000

To locate these elements in the vector, we can use the find() function:

find(sampleVectorM > 4 | sampleVectorM < -4)

ans =
      3    5    7    8    9

If we need to return the elements that are larger than 4 or smaller than -4 and are also even numbers in sampleVectorM:

results = sampleVectorM((sampleVectorM > 4 | sampleVectorM < -4) & ...
             rem(sampleVectorM , 2) == 0)

results =
      32    12

From the examples, it is clear that we can build complex and useful logical indexes using & and | along with any other valid MATLAB functions. Without such indexing, you have to write one or more loops and check every element of the vector against the condition, which can be tedious, error-prone, and computationally expensive for large vectors. Let’s see the code for the last example without logical indexing.

results = []; % placeholder for the returned elements; its final size is unknown, so it cannot be preallocated
Ind = 1; % index of the next free slot in results
for i = 1:length(sampleVectorM)
    % keep elements that are (> 4 or < -4) and even
    if ((sampleVectorM(i) > 4 || sampleVectorM(i) < -4) && ...
        rem(sampleVectorM(i) , 2) == 0)
          results(Ind) = sampleVectorM(i);
          Ind = Ind + 1;
    end
end
results

results =
      32    12

We get the same result, but with considerably more code.
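One way to gain confidence in a vectorized expression is to compare it against a loop with isequal(). The sketch below is an illustration under assumed names (v, isBig, isEven, keep), selecting elements that are larger than 4 or smaller than -4 and also even. The loop fills a preallocated logical mask rather than growing the result, which sidesteps the unknown output size.

```matlab
v = [-1, 2.2, 32, 1/4, -3*pi, 2.9, 5, 12, 8.6/2, 2^1.6];

% Vectorized version: name each condition, then combine the masks
isBig  = v > 4 | v < -4;
isEven = rem(v, 2) == 0;
vectorized = v(isBig & isEven);

% Loop version: fill a preallocated mask one element at a time
keep = false(size(v));
for k = 1:numel(v)
    keep(k) = (v(k) > 4 || v(k) < -4) && rem(v(k), 2) == 0;
end
looped = v(keep);

isequal(vectorized, looped)      % 1: both approaches agree
```

Naming the intermediate masks also tends to make long conditions easier to read than a single nested expression.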

Exercises

Exercise 1: In a single line, return all positive and odd elements of sampleVectorE1_1 that are larger than their counterparts in sampleVectorE1_2.

  % seed the random number generator so that the same numbers are produced every time the code runs
  rng(1,'twister');           
  sampleVectorE1_1 = randi(10000 , 40 , 1) - 5000;
  rng(2,'twister');           
  sampleVectorE1_2 = randi(10000 , 1 , 40) - 5000;
  

Note that, because rng() seeds the generator, randi() produces the same set of numbers every time you run the code. In any case, your solution should not depend on the specific values.

Answer:

  sampleVectorE1_1(sampleVectorE1_1 > 0 & ...
     rem(sampleVectorE1_1 , 2)~= 0 & ...
     (sampleVectorE1_1 - sampleVectorE1_2') > 0)
  

ans = 8 × 1
      389
      1853
      587
      4683
      3947
      4579
      1919
      3347

Exercise 2: There are some NaN records in the sampleVectorE2 vector below. In a single line of code, find the NaN entries at odd locations within the vector and replace them with -10. Assume that you cannot see the vector’s elements and are not allowed to use direct indexing.

  sampleVectorE2 = [0.11 , NaN , 0.88 , 0.55 , NaN , ...
       NaN , NaN , 0.97 , 0.12 , NaN , 0.87 , 0.04 , ...
       0.69 , 0.98 , NaN , 0.13 , 0.68 , 0.91 , 0.61 , 0.35];
  

NaN means Not-a-Number. It is a valid arithmetic representation that results from undefined numerical operations. Run doc NaN in the command window to read more about it.

Hint: you can use the isnan() function to return a logical vector with 1 for every NaN record. Check out doc isnan in the command window.

Answer:

  sampleVectorE2(isnan(sampleVectorE2) & ...
      rem(1:length(sampleVectorE2) , 2)~= 0) = -10
  

sampleVectorE2= 1 × 20
      0.1100    NaN    0.8800    0.5500    -10.0000    NaN    -10.0000    0.9700    0.1200    NaN    0.8700    0.0400    0.6900    0.9800    -10.0000    0.1300    0.6800    0.9100    0.6100    0.3500
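If the one-line constraint is relaxed, the odd positions can also be marked with a logical mask built from 1:2:end. The sketch below uses a shorter illustrative vector v and an assumed name oddPos. It also shows why isnan() is needed: NaN is never equal to anything, including itself, so a comparison with == finds nothing.

```matlab
v = [0.1, NaN, NaN, 0.4, NaN];   % illustrative vector

any(v == NaN)                    % 0: == never matches NaN
nnz(isnan(v))                    % 3: isnan() finds all three

oddPos = false(size(v));         % mask with true at positions 1, 3, 5
oddPos(1:2:end) = true;

v(isnan(v) & oddPos) = -10;      % replace NaN only at odd positions
v                                % 0.1000 NaN -10.0000 0.4000 -10.0000
```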

You can also check isa(), isinf(), isnumeric(), isstring(), ischar(), and similar functions to learn more about what MATLAB offers. These functions can be very useful when used in the right place.

Exercise 3: Unfortunately, we have a lot to cover and do not have time to work with other data types. Still, let’s finish with an exercise on string vectors. Assume that sampleVectorE3 is the string array (vector) given below. In a single line, find the entries where "a" appears in the second position. For example, your code should return "ca" but not "ab".

  sampleVectorE3 = ["ab" , "ca" , "bca" , "ad" , "da" , "cae"];
  

Hint: you need to use the strfind() and cell2mat() functions. Read more about them by running doc strfind and doc cell2mat in the command window.

Answer:

  sampleVectorE3(cell2mat(strfind(sampleVectorE3 , "a")) == 2)
  

ans =
      1 × 3 string array
      "ca"    "da"    "cae"

  find(cell2mat(strfind(sampleVectorE3 , "a")) == 2)
  

ans =
      2    5    6
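The cell2mat() approach relies on every entry containing exactly one "a", so that each cell holds a single number. As a hedged alternative for the general case, cellfun() can test each cell separately. The vector s below is an illustrative extension of sampleVectorE3 with one entry that has several matches and one that has none.

```matlab
% Illustrative vector: "banana" has three matches, "xyz" has none
s = ["ab", "ca", "bca", "ad", "da", "cae", "banana", "xyz"];

% strfind returns one cell per string holding all match positions;
% any(k == 2) is true when one of those positions is 2
hits = cellfun(@(k) any(k == 2), strfind(s, "a"));

s(hits)        % "ca" "da" "cae" "banana"
find(hits)     % 2 5 6 7
```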
