DocWire SDK
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing possible for security and confidentiality
docwire::local_ai::embed Class Reference

A chain element that generates embeddings for input text using a local AI model.

#include <local_ai_embed.h>

Inheritance diagram for docwire::local_ai::embed:
docwire::local_ai::embed inherits docwire::chain_element and docwire::with_pimpl< embed >. docwire::chain_element inherits docwire::with_pimpl< chain_element >. Both with_pimpl specializations inherit docwire::with_pimpl_base.

Public Member Functions

 embed (std::shared_ptr< model_runner > model_runner, std::string prefix)
 Construct a local AI embed chain element with a specific model runner and prefix.
 
 embed (std::string prefix)
 Construct a local AI embed chain element with a default model runner and prefix.
 
continuation operator() (message_ptr msg, const message_callbacks &emit_message) override
 
bool is_leaf () const override
 Check whether the chain element is a leaf (a last element that does not produce any messages). Currently, only exporters are leaves.
 
- Public Member Functions inherited from docwire::chain_element
 chain_element (chain_element &&)=default
 
chain_element & operator= (chain_element &&)=default
 
virtual bool is_generator () const
 

Static Public Attributes

static const std::string e5_passage_prefix
 Common prefix for passage embeddings with E5 models.
 
static const std::string e5_query_prefix
 Common prefix for query embeddings with E5 models.
 

Additional Inherited Members

- Protected Types inherited from docwire::with_pimpl< chain_element >
using impl_type = pimpl_impl< chain_element >
 
- Protected Types inherited from docwire::with_pimpl< embed >
using impl_type = pimpl_impl< embed >
 
- Protected Member Functions inherited from docwire::with_pimpl< chain_element >
impl_type & create_impl (Args &&... args)
 
 with_pimpl (Args &&... args)
 
 with_pimpl (with_pimpl< chain_element > &&other) noexcept
 
 with_pimpl (std::nullptr_t)
 
with_pimpl & operator= (with_pimpl &&other) noexcept
 
impl_type & impl ()
 
const impl_type & impl () const
 
- Protected Member Functions inherited from docwire::with_pimpl< embed >
impl_type & create_impl (Args &&... args)
 
 with_pimpl (Args &&... args)
 
 with_pimpl (with_pimpl< embed > &&other) noexcept
 
 with_pimpl (std::nullptr_t)
 
with_pimpl & operator= (with_pimpl &&other) noexcept
 
impl_type & impl ()
 
const impl_type & impl () const
 

Detailed Description

A chain element that generates embeddings for input text using a local AI model.

This class is a chain element that takes a model_runner to generate a vector embedding for a given text. It is designed to work with sentence-transformer models like multilingual-e5-small.

Definition at line 31 of file local_ai_embed.h.
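
Vector embeddings such as those this element generates are usually compared with cosine similarity. The following is a minimal, self-contained sketch of that comparison and does not use the DocWire API. The helper name cosine_similarity and the choice of std::vector<double> as the vector type are assumptions made for illustration only; this page does not show how the embedding is represented in the output message.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of DocWire: cosine similarity of two
// embedding vectors. Returns a value in [-1, 1]; values near 1 mean the
// vectors point in nearly the same direction.
double cosine_similarity(const std::vector<double>& a, const std::vector<double>& b)
{
    if (a.empty() || a.size() != b.size())
        throw std::invalid_argument("vectors must be non-empty and of equal size");

    double dot = 0.0;
    double norm_a = 0.0;
    double norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }

    // A zero vector has no direction, so the similarity is undefined.
    if (norm_a == 0.0 || norm_b == 0.0)
        throw std::invalid_argument("zero vector has no direction");

    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```

In a retrieval setting, one would typically embed the stored passages once, embed each incoming query, and rank the passages by this score.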

Constructor & Destructor Documentation

◆ embed() [1/2]

docwire::local_ai::embed::embed ( std::shared_ptr< model_runner > model_runner,
std::string  prefix 
)
explicit

Construct a local AI embed chain element with a specific model runner and prefix.

Parameters
model_runner  The model runner to use for generating embeddings.
prefix  The string to prepend to the input text. Use an empty string for no prefix.
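
The effect of the prefix parameter can be sketched without the SDK. The example below is an illustration under stated assumptions: apply_prefix is a hypothetical helper, and the two constants reflect the "query: " and "passage: " convention documented for the E5 model family. The actual values of e5_query_prefix and e5_passage_prefix are not shown on this page.

```cpp
#include <string>
#include <string_view>

// Assumed values, following the E5 model family's documented input
// convention. They are NOT taken from embed::e5_query_prefix or
// embed::e5_passage_prefix, whose contents this page does not list.
inline constexpr std::string_view assumed_query_prefix = "query: ";
inline constexpr std::string_view assumed_passage_prefix = "passage: ";

// Hypothetical helper: build the text that would be handed to the model
// by prepending the prefix. An empty prefix leaves the text unchanged.
std::string apply_prefix(std::string_view prefix, std::string_view text)
{
    std::string result;
    result.reserve(prefix.size() + text.size());
    result.append(prefix);
    result.append(text);
    return result;
}
```

Under this convention, texts being indexed would get the passage prefix and search queries the query prefix, so that the model sees each kind of input in the form it was trained on.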

◆ embed() [2/2]

docwire::local_ai::embed::embed ( std::string  prefix)
explicit

Construct a local AI embed chain element with a default model runner and prefix.

This constructor initializes the embedder with a default model_runner configured to use the multilingual-e5-small-ct2-int8 model.

Parameters
prefix  The string to prepend to the input text. Use an empty string for no prefix.
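
The relationship between the two constructors can be pictured as a delegating constructor that supplies a default runner. The sketch below uses hypothetical stand-in types (toy_runner, toy_embed) rather than the real model_runner and embed classes, and it only assumes that the one-argument form behaves like the two-argument form with a default runner. How DocWire implements this internally is not shown on this page.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a model runner. It only records the name of
// the model it would load; no model is actually loaded.
struct toy_runner
{
    std::string model_name;
};

// Hypothetical stand-in for the embed chain element.
class toy_embed
{
public:
    // Mirrors embed() [1/2]: the caller supplies the runner.
    explicit toy_embed(std::shared_ptr<toy_runner> runner, std::string prefix)
        : m_runner(std::move(runner)), m_prefix(std::move(prefix))
    {
    }

    // Mirrors embed() [2/2]: delegate to the first form with a default
    // runner named after the model mentioned in the documentation.
    explicit toy_embed(std::string prefix)
        : toy_embed(std::make_shared<toy_runner>(toy_runner{"multilingual-e5-small-ct2-int8"}),
                    std::move(prefix))
    {
    }

    const std::string& model_name() const { return m_runner->model_name; }
    const std::string& prefix() const { return m_prefix; }

private:
    std::shared_ptr<toy_runner> m_runner;
    std::string m_prefix;
};
```

Passing the runner as a std::shared_ptr lets several elements share one loaded model, which is one plausible reason for the two-argument form.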

Member Function Documentation

◆ is_leaf()

bool docwire::local_ai::embed::is_leaf ( ) const
inline override virtual

Check whether the chain element is a leaf (a last element that does not produce any messages). Currently, only exporters are leaves.

Returns
true if leaf

Implements docwire::chain_element.

Definition at line 58 of file local_ai_embed.h.
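
The leaf concept can be illustrated with a small stand-alone model. Everything in this sketch is hypothetical (toy_element, toy_transformer, toy_exporter, leaf_only_at_end) and is not the DocWire class hierarchy. It assumes only what the description above states: a leaf is a last element that emits no messages. An element that emits embeddings would presumably report false, although this page does not state the value embed::is_leaf() returns.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal interface modelled on the is_leaf() description.
struct toy_element
{
    virtual ~toy_element() = default;
    virtual bool is_leaf() const = 0;
};

// Emits messages to the next element, so it is not a leaf.
struct toy_transformer : toy_element
{
    bool is_leaf() const override { return false; }
};

// Consumes messages and emits nothing, so it is a leaf.
struct toy_exporter : toy_element
{
    bool is_leaf() const override { return true; }
};

// Returns true if no element before the last one is a leaf, i.e. a leaf
// can only terminate the chain. An empty chain is trivially accepted.
bool leaf_only_at_end(const std::vector<std::unique_ptr<toy_element>>& chain)
{
    for (std::size_t i = 0; i + 1 < chain.size(); ++i)
    {
        if (chain[i]->is_leaf())
            return false;
    }
    return true;
}
```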


The documentation for this class was generated from the following file: