SAX Symbolic Sequence Example Source Code
--------------------
timeseries2symbol.m:
--------------------
This function takes in a time series and converts it to one or more strings.
There are two options:
1. Convert the entire time series to ONE string
2. Use sliding windows, extract the subsequences and convert these subsequences to strings
For the first option, simply enter the length of the time series as "N"
ex. We have a time series of length 32 and we want to convert it to an 8-symbol string,
with alphabet size 3:
timeseries2symbol(data, 32, 8, 3)
For the second option, enter the desired sliding window length as "N"
ex. We have a time series of length 32 and we want to extract subsequences of length 16 using
sliding windows, and convert the subsequences to 8-symbol strings, with alphabet size 3:
timeseries2symbol(data, 16, 8, 3)
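The two call patterns above differ only in how many subsequences are extracted. A minimal Python sketch of that difference follows. It is not the MATLAB source: the helper name sliding_windows is hypothetical, and it assumes the window advances one data point at a time, which the text above does not state.

```python
def sliding_windows(data, N):
    """Return every length-N subsequence of data (hypothetical helper).

    Assumption: the window moves forward one point per step.
    """
    if N < 1 or N > len(data):
        raise ValueError("N must be between 1 and len(data)")
    return [data[i:i + N] for i in range(len(data) - N + 1)]


data = list(range(32))                 # stand-in for a length-32 time series
whole = sliding_windows(data, 32)      # option 1: N equals the series length
windows = sliding_windows(data, 16)    # option 2: N is the window length
print(len(whole), len(windows))
```

Under this assumption, N = 32 yields a single subsequence and N = 16 yields 32 - 16 + 1 = 17 subsequences.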
Input:
data is the raw time series.
N is the length of sliding window (use the length of the raw time series
instead if you don't want to have sliding windows)
n is the number of symbols in the low-dimensional approximation of the subsequence.
alphabet_size is the number of discrete symbols. 2 <= alphabet_size <= 10, although alphabet_size = 2 is a
special "useless" case.
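As an illustration of what alphabet_size controls, the sketch below uses the standard SAX convention of cutting the standard normal distribution into alphabet_size equiprobable regions. The text above does not describe how this function picks its cut points, so the breakpoint rule, the helper names, and the letter labels are all assumptions. The code is Python, not the MATLAB source.

```python
from bisect import bisect_right
from statistics import NormalDist


def sax_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into equiprobable regions (assumed rule)."""
    if not 2 <= alphabet_size <= 10:
        raise ValueError("alphabet_size must satisfy 2 <= alphabet_size <= 10")
    normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def to_symbol(value, breakpoints):
    """Map one value to a letter; 'a' is the lowest region (labels are illustrative)."""
    return "abcdefghij"[bisect_right(breakpoints, value)]


cuts = sax_breakpoints(3)              # alphabet_size = 3 gives two cut points
word = "".join(to_symbol(v, cuts) for v in [-1.0, 0.0, 1.0])
print([round(c, 2) for c in cuts], word)
```

An alphabet of size a needs a - 1 cut points, so alphabet_size = 3 produces two cut points that are symmetric about zero.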
Output:
symbolic_data: matrix of symbolic data (no-repetition). If consecutive subsequences
have the same string, then only the first occurrence is recorded, with
a pointer to its location stored in "pointers"
pointers: location of the first occurrences of the strings
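The no-repetition output described above can be sketched in Python as follows. This is an illustration, not the MATLAB source: the helper name drop_consecutive_repeats is hypothetical, the strings are made up, and the pointers here are 0-based, whereas MATLAB indices are 1-based.

```python
def drop_consecutive_repeats(strings):
    """Keep only the first string of each run of identical consecutive strings.

    Returns (symbolic_data, pointers), where pointers holds the 0-based
    position in the input of each string that was kept.
    """
    symbolic_data, pointers = [], []
    for i, s in enumerate(strings):
        # Record a string only when it differs from the last one recorded.
        if not symbolic_data or s != symbolic_data[-1]:
            symbolic_data.append(s)
            pointers.append(i)
    return symbolic_data, pointers


strings = ["aab", "aab", "abc", "abc", "abc", "aab"]  # made-up example strings
symbolic_data, pointers = drop_consecutive_repeats(strings)
print(symbolic_data)
print(pointers)
```

Only consecutive repeats are dropped, so a string that reappears later in the series (here "aab") is recorded again with its own pointer.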
N/n must be an integer; otherwise the program gives a warning and aborts.
The variable "win_size" is set to N/n. This is the number of data points in the raw
time series that are mapped to a single symbol, and can be thought of as the
"compression rate".
The symbolic data is returned in "symbolic_data", with pointers to th
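The win_size rule described above can be sketched in Python as follows. The sketch assumes that each block of win_size points is reduced to its mean (piecewise averaging, the usual SAX reduction step); the text says only that win_size points map to one symbol. The helper name paa is hypothetical, and the sketch raises an error where the text describes a warning followed by an abort.

```python
def paa(subsequence, n):
    """Reduce a length-N subsequence to n block means (hypothetical helper)."""
    N = len(subsequence)
    if N % n != 0:
        # Stands in for the warn-and-abort behaviour described in the text.
        raise ValueError("N/n must be an integer")
    win_size = N // n  # raw data points represented by one symbol
    return [sum(subsequence[i:i + win_size]) / win_size
            for i in range(0, N, win_size)]


subsequence = [1, 2, 3, 4, 5, 6, 7, 8]   # N = 8
means = paa(subsequence, 4)              # n = 4, so win_size = 2
print(means)
```

With N = 8 and n = 4, each output value summarizes two raw points; calling paa(subsequence, 3) would raise because 8/3 is not an integer.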