gnes.flow.base module

class gnes.flow.base.BaseIndexFlow(*args, **kwargs)

Bases: gnes.flow.Flow

BaseIndexFlow defines a common service pipeline when indexing.

It cannot be used directly, because every service uses the base module by default. Use set() to change the yaml_path of each service first.

train(bytes_gen=None, callback=<function remove_envelope>, **kwargs)

Trains the current flow.

It starts a CLIClient and calls train().

Example:

with f.build(backend='thread') as flow:
    flow.train(txt_file='aa.txt')
    flow.train(image_zip_file='aa.zip', batch_size=64)
    flow.train(video_zip_file='aa.zip')
    ...

This calls the pre-built reader, which reads the files into an iterator of bytes and feeds it to the flow.

You can also build your own reader or generator.

Example:

def my_reader():
    for _ in range(10):
        yield b'abcdfeg'   # each yield generates a document for training

with f.build(backend='thread') as flow:
    flow.train(bytes_gen=my_reader())

Parameters:
  • bytes_gen (Optional[Iterator[bytes]]) – An iterator of bytes, where each item is one document. If omitted, the input must be specified through kwargs (for example txt_file).
  • kwargs – Accepts all keyword arguments of the GNES client CLI.
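
The custom reader pattern shown for bytes_gen extends naturally to real files. The following is a minimal sketch, not part of GNES: line_reader is a hypothetical helper that yields each non-empty line of a text file as one bytes document, which matches the Iterator[bytes] type that bytes_gen expects.

```python
from typing import Iterator


def line_reader(path: str) -> Iterator[bytes]:
    """Hypothetical helper (not part of GNES): one document per non-empty line."""
    with open(path, 'rb') as fp:
        for line in fp:
            doc = line.strip()
            if doc:  # skip blank lines so no empty document is sent
                yield doc
```

Assuming a flow built as in the examples for train(), such a generator could be passed as flow.train(bytes_gen=line_reader('aa.txt')).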

class gnes.flow.base.BaseQueryFlow(*args, **kwargs)

Bases: gnes.flow.Flow

BaseQueryFlow defines a common service pipeline when querying.

It cannot be used directly, because every service uses the base module by default. Use set() to change the yaml_path of each service first.

train(bytes_gen=None, callback=<function remove_envelope>, **kwargs)

Trains the current flow.

It starts a CLIClient and calls train().

Example:

with f.build(backend='thread') as flow:
    flow.train(txt_file='aa.txt')
    flow.train(image_zip_file='aa.zip', batch_size=64)
    flow.train(video_zip_file='aa.zip')
    ...

This calls the pre-built reader, which reads the files into an iterator of bytes and feeds it to the flow.

You can also build your own reader or generator.

Example:

def my_reader():
    for _ in range(10):
        yield b'abcdfeg'   # each yield generates a document for training

with f.build(backend='thread') as flow:
    flow.train(bytes_gen=my_reader())

Parameters:
  • bytes_gen (Optional[Iterator[bytes]]) – An iterator of bytes, where each item is one document. If omitted, the input must be specified through kwargs (for example txt_file).
  • kwargs – Accepts all keyword arguments of the GNES client CLI.
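
When the documents live in an archive, a custom generator can stand in for the pre-built zip readers. The following is a minimal sketch, not part of GNES and not how the pre-built reader is necessarily implemented: zip_reader is a hypothetical helper that yields the raw bytes of every file in a zip archive, one document per file.

```python
import zipfile
from typing import Iterator


def zip_reader(path: str) -> Iterator[bytes]:
    """Hypothetical helper (not part of GNES): one document per archived file."""
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if not info.is_dir():  # directory entries carry no document content
                yield zf.read(info.filename)
```

Assuming a flow built as in the examples for train(), such a generator could be passed as flow.train(bytes_gen=zip_reader('aa.zip')). The archive stays open until the generator is exhausted, so each file is read only when the flow asks for it.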