While they are not limited to the Content Pattern Extractor, they form the cornerstone of the app and will be used heavily.
var lines = text.split('\n').filter(function (item) {
    return item != '';
});
var ret = [];
var i;
var topLimit = params.limit*1 + params.offset*1;
for (i = params.offset; i < topLimit; i++) {
    if (lines[i] == '') {
        continue;
    }
    ret.push({
        id: i*1 + 1,
        offset: i - params.offset,
        content: lines[i]
    });
}
return cb({
    success: true,
    data: ret
});

This will allow the data to become addressable and linkable. Other pipeline processors, such as RAG, will often provide a link to the source of the generated data by linking to an LBL grid.
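As a minimal usage sketch, the snippet above can be wrapped in a function and called with an offset/limit window. The function name getLines and the callback shape are assumptions for illustration, not part of the library:

```javascript
// Hypothetical wrapper around the snippet above; getLines is an
// assumed name, and the cb({success, data}) shape follows the snippet.
function getLines(text, params, cb) {
  var lines = text.split('\n').filter(function (item) {
    return item != '';
  });
  var ret = [];
  var i;
  var topLimit = params.limit*1 + params.offset*1;
  for (i = params.offset; i < topLimit; i++) {
    if (lines[i] == '') {
      continue;
    }
    ret.push({
      id: i*1 + 1,            // 1-based index into the non-empty lines: a stable link target
      offset: i - params.offset, // position within the requested window
      content: lines[i]
    });
  }
  return cb({
    success: true,
    data: ret
  });
}

var result;
getLines('alpha\nbravo\n\ncharlie', { offset: 1, limit: 2 }, function (res) {
  result = res;
});
// result.data now holds two addressable lines:
// [{ id: 2, offset: 0, content: 'bravo' },
//  { id: 3, offset: 1, content: 'charlie' }]
```

Because each returned entry carries a stable id, a downstream processor can link back to an exact line in the source stream.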
The following definition describes a block within the stream:
var blockDefinition = {
    offset: [-2, 0],
    rowBegin: '[bookmark: ',
    rowEnd: 'Leave a comment',
    fields: {
        reviewerName: [0],
        revieweeName: [1],
        reviewDate: [3]
    },
    blocks: [{
        rowBegin: 'Review Details:',
        rowEnd: 'Rate reviews',
        fields: {
            reviewItem1: [1, 1],
            reviewItem2: [27]
        }
    }]
};

rowBegin/rowEnd define the boundaries for the block. If omitted, they default to the first and last line of the stream.
The offset will allow you to adjust the boundaries for the block initially found by rowBegin and rowEnd.
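As a sketch of that adjustment, assuming offset is applied additively to the matched line indices (an assumption suggested by the offset: [-2, 0] value in the example definition above):

```javascript
// Hypothetical helper: shift the block boundaries found by rowBegin/rowEnd.
// The additive interpretation of offset is an assumption for illustration.
function applyOffset(beginIndex, endIndex, offset) {
  offset = offset || [0, 0]; // no offset: keep the matched boundaries
  return [beginIndex + offset[0], endIndex + offset[1]];
}

// If rowBegin matched line 10 and rowEnd matched line 20, an offset of
// [-2, 0] widens the block to start two lines earlier:
var adjusted = applyOffset(10, 20, [-2, 0]); // [8, 20]
```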
The fields will define how data should be extracted within the block. The format is:
fields: {
    fieldName: [start(BLN), length]
}
start is the BLN (block line number) at which extraction begins, starting with index 0. If start is positive and greater than or equal to the length of the block, the field will be set to empty. If start is negative, it is used as an index from the end of the block.
If length is not specified, the rest of the block will be assigned as the field value.
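These start/length rules can be sketched as follows. extractFields is a hypothetical helper written only to illustrate the semantics described above; it is not part of the library:

```javascript
// Hypothetical sketch of field extraction from a block's lines,
// following the start/length rules described above.
function extractFields(blockLines, fields) {
  var out = {};
  Object.keys(fields).forEach(function (name) {
    var spec = fields[name];
    var start = spec[0];
    var length = spec[1];
    if (start < 0) {
      start = blockLines.length + start; // negative start counts from the end
    }
    if (start >= blockLines.length) {
      out[name] = ''; // start past the end of the block: empty field
      return;
    }
    var slice = (length === undefined)
      ? blockLines.slice(start)               // no length: the rest of the block
      : blockLines.slice(start, start + length);
    out[name] = slice.join('\n');
  });
  return out;
}

var block = ['a', 'b', 'c', 'd'];
var extracted = extractFields(block, {
  first: [0, 1],   // one line from the top
  tail: [2],       // no length: the rest of the block
  last: [-1, 1],   // negative start counts from the end
  missing: [9, 1]  // start past the end: empty field
});
// extracted: { first: 'a', tail: 'c\nd', last: 'd', missing: '' }
```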
Izy Cloud Tika Server: http://localhost:9998/