Splitting a delimited string into pieces
Problem
You want to split a comma- or tab-delimited string into a vector containing strings or numbers.
Solution
There are two general methods. The first is simpler and handles a single string using basic MATLAB functions. The second uses the textscan() function; it is more powerful and flexible, but it may be slower.
First method
This handles strings with mixed types (strings and numbers) and returns a cell array containing those values. Strings remain strings; numeric strings are converted to numbers. If none of the input values are numbers, the part that attempts and tests the number conversion can be removed.
    % Named str rather than string, to avoid shadowing the built-in string() in newer MATLAB versions
    str = 'one,two,,four';
    delimiter = ',';
    % For tabs, use: delimiter = sprintf('\t');

    % Find the delimiters
    delimIdx = find(str == delimiter);

    % Pretend there are delimiters at the beginning and end, for the loop below
    delimIdx = [0 delimIdx length(str)+1];

    % Preallocate cell array to hold substrings
    subStrings = cell(1, length(delimIdx) - 1);

    % Process each element
    for i = 1:length(subStrings)
        % Find the text between the delimiters
        % (don't include the delimiters)
        startOffset = delimIdx(i) + 1;
        endOffset = delimIdx(i+1) - 1;

        % Get the element
        txt = str(startOffset:endOffset);

        % Attempt conversion to number
        num = sscanf(txt, '%f');

        % sscanf returns an empty array if the text is not a number
        if isempty(num)
            subStrings{i} = txt;
        else
            subStrings{i} = num;
        end
    end

    % Print out the strings
    subStrings
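One caveat with the conversion step above: sscanf() with '%f' accepts text that merely begins with a number, so a field such as '3abc' is converted to 3. The following is a minimal sketch of a stricter test using str2double(), which returns NaN when the whole field is not a number. The variable names fields and converted are made up for this illustration, and the field values are sample data. Note that with this approach a field containing the literal text 'NaN' stays a string, and str2double() also accepts some forms, such as complex numbers, that you may not want.

```matlab
% Sample fields, as they would come out of the splitting loop (illustrative data)
fields = {'12', '3abc', 'four', ''};

% Preallocate the output cell array
converted = cell(size(fields));

for i = 1:numel(fields)
    % str2double returns NaN unless the entire text is a number
    num = str2double(fields{i});

    if isnan(num)
        % Not a number: keep the original text
        converted{i} = fields{i};
    else
        converted{i} = num;
    end
end

% Print out the converted values
converted
```

Here only '12' is converted to a number; '3abc', 'four', and the empty field are kept as text.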
If you know that your string is all numbers, it may be more convenient to return a normal (non-cell) vector with the values.
    % Named str rather than string, to avoid shadowing the built-in string() in newer MATLAB versions
    str = '1,2,,4';
    delimiter = ',';

    % Find the delimiters
    delimIdx = find(str == delimiter);

    % Pretend there are delimiters at the beginning and end, for the loop below
    delimIdx = [0 delimIdx length(str)+1];

    % Preallocate an array to hold values
    values = zeros(1, length(delimIdx) - 1);

    % Process each element
    for i = 1:length(values)
        % Find the text between the delimiters
        % (don't include the delimiters)
        startOffset = delimIdx(i) + 1;
        endOffset = delimIdx(i+1) - 1;

        % Get the element
        txt = str(startOffset:endOffset);

        % Attempt conversion to number
        num = sscanf(txt, '%f');

        % If the conversion fails or the field is empty, assign NaN;
        % otherwise assign the number
        if isempty(num)
            values(i) = NaN;
        else
            values(i) = num;
        end
    end

    % Print out the values
    values
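For all-numeric input, the same result can be sketched more compactly. This is not the loop-based method above but a different technique: it assumes a MATLAB version in which regexp() supports the 'split' option, and it relies on str2double() accepting a cell array of strings and returning NaN for any field that is not a number (including empty fields). The variable names parts and values are chosen for this illustration.

```matlab
str = '1,2,,4';

% Split on the delimiter; empty fields are kept as empty strings
parts = regexp(str, ',', 'split');

% Convert every field at once; non-numeric and empty fields become NaN
values = str2double(parts);

% Print out the values
values
```

For the sample input this gives a 1x4 vector whose third element is NaN.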
Second method
The textscan() function converts a string to a cell array with one cell per field in the format string. Each of those cells holds a whole column of values: a numeric array for number fields, or another cell array for string fields. Note that this is different from a two-dimensional cell array.

    str = 'asdf,35,4,w,2';
    values = textscan(str, '%s%f%f%s%f', 'delimiter', ',');

    values{2}(1)  % returns 35
    values{2}     % also returns [35], which is equivalent to 35, since it's a 1x1 matrix
    values{1}{1}  % returns 'asdf'
    values{1}     % returns a 1x1 cell array: { 'asdf' }
In the format specification string:
%f means to convert to a floating-point number.
%d means to convert to an integer.
%s means to convert to a string.

textscan() returns one column per field, rather than a single flat cell array, because it is designed to operate on multi-line strings: each line of input adds one row to every column.

    % This string has two lines:
    % asdf,35,4,qwerty,2
    % foo,56,32,bar,5
    str = sprintf('asdf,35,4,qwerty,2\nfoo,56,32,bar,5');
    values = textscan(str, '%s%f%f%s%f', 'delimiter', ',');

    values{2}(1)  % returns 35
    values{2}(2)  % returns 56
    values{2}     % returns [35; 56]
    values{1}{1}  % returns 'asdf'
    values{1}{2}  % returns 'foo'
    values{1}     % returns a 2x1 cell array: { 'asdf'; 'foo' }
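To see how this output differs from a two-dimensional cell array, it can help to inspect the type and height of each column. The following is a small sketch using the same two-line input; the variable names columnTypes and columnRows are made up for this illustration.

```matlab
str = sprintf('asdf,35,4,qwerty,2\nfoo,56,32,bar,5');
values = textscan(str, '%s%f%f%s%f', 'delimiter', ',');

% Class of each column: 'cell' for %s fields, 'double' for %f fields
columnTypes = cellfun(@class, values, 'UniformOutput', false)

% Number of rows in each column: one row per line of input
columnRows = cellfun(@(c) size(c, 1), values)
```

The outer cell array is 1x5 (one cell per field), and each column has two rows, one for each input line.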
If the input is all numbers, it can be converted from a cell array to a regular array with cell2mat(). This function requires all of the fields to have the same type, as with the format %f%f%f or %d%d%d. If you have a mixture of the two types, read them all as floating-point numbers.

    str = '4,3,,56.32';
    values = textscan(str, '%f%f%f%f', 'delimiter', ',');
    cell2mat(values)  % returns [4 3 NaN 56.32]; the empty field becomes NaN

For a tab-delimited string, use sprintf() to generate the tab character.

    str = sprintf('4\t3\t\t56.32');
    values = textscan(str, '%f%f%f%f', 'delimiter', sprintf('\t'));
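With numeric formats, textscan() fills an empty field with NaN by default. The sketch below first locates the empty field with isnan(), then shows one way to substitute a different number, assuming your MATLAB version's textscan() supports the 'EmptyValue' option. The variable names row, emptyMask, valuesZero, and rowZero are chosen for this illustration.

```matlab
str = '4,3,,56.32';

% Default behavior: the empty third field is returned as NaN
values = textscan(str, '%f%f%f%f', 'delimiter', ',');
row = cell2mat(values);
emptyMask = isnan(row)   % true only at the position of the empty field

% Assumed option: 'EmptyValue' replaces empty numeric fields with the given number
valuesZero = textscan(str, '%f%f%f%f', 'delimiter', ',', 'EmptyValue', 0);
rowZero = cell2mat(valuesZero)
```

Substituting a value such as 0 is only appropriate if 0 cannot be confused with real data; otherwise keeping NaN is usually the safer choice.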
Notes
Another way to split a string on delimiters is the strtok() function. However, strtok() collapses consecutive delimiters, which may cause problems if you have any empty values.
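The collapsing behavior can be seen with the same sample string used in the first method. This is a minimal sketch of a strtok() loop; the variable names tokens and remainder are chosen for this illustration.

```matlab
str = 'one,two,,four';

tokens = {};
remainder = str;

% Pull off one token at a time until nothing is left
while ~isempty(remainder)
    [tok, remainder] = strtok(remainder, ',');

    % strtok returns an empty token if only delimiters remain
    if ~isempty(tok)
        tokens{end+1} = tok;
    end
end

% Print out the tokens
tokens
```

The result has three elements ('one', 'two', 'four') instead of four: the empty value between the two consecutive commas is lost, so the positions of the later fields shift.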