Class: Parse::Embeddings::Fixture

Inherits:
Provider
  • Object
Defined in:
lib/parse/embeddings/fixture.rb

Overview

Deterministic, zero-network embedding provider for tests.

Vectors are derived from a SHA-256 digest of (model_name, input_type, input). The same input always produces the same vector, different inputs produce different vectors (barring a hash collision), and :search_query and :search_document produce different vectors for the same string. Cache-key bugs and input-type confusion in higher layers therefore surface in tests rather than only against Cohere / Voyage in production.
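The derivation described above can be sketched roughly as follows. This is an illustration only: `sketch_seed` and `sketch_seeded_unit_vector` are hypothetical names, and the byte-to-float mapping is an assumption, since the library's private `seeded_unit_vector` is not shown on this page. Only the NUL-joined seed layout is taken from the `embed_text` source.

```ruby
require "digest"

# Hypothetical helper (not the library's seeded_unit_vector): joins the three
# seed components with NUL bytes, the same layout embed_text uses.
def sketch_seed(model_name, input_type, input)
  "#{model_name}\0#{input_type}\0#{input}"
end

# Hypothetical expansion: chains SHA-256 digests until enough bytes exist,
# maps each byte to a float, then scales the vector to unit length.
def sketch_seeded_unit_vector(seed, dimensions)
  bytes = []
  block = Digest::SHA256.digest(seed)
  while bytes.length < dimensions
    bytes.concat(block.bytes)
    block = Digest::SHA256.digest(block) # chain: hash the previous digest
  end
  # Map each byte from 0..255 onto -1.0..1.0 (never exactly 0.0).
  raw = bytes.first(dimensions).map { |b| (b - 127.5) / 127.5 }
  norm = Math.sqrt(raw.sum { |x| x * x })
  raw.map { |x| x / norm }
end

doc   = sketch_seeded_unit_vector(sketch_seed("fixture-deterministic", :search_document, "hello"), 64)
again = sketch_seeded_unit_vector(sketch_seed("fixture-deterministic", :search_document, "hello"), 64)
query = sketch_seeded_unit_vector(sketch_seed("fixture-deterministic", :search_query, "hello"), 64)
```

In this sketch `doc` and `again` are identical, while `query` differs because the input type is part of the seed.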

Output is unit-normalized so similarity tests don't need to know the magnitude of the seed expansion.
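One practical consequence of unit-normalized output is that cosine similarity reduces to a plain dot product. A minimal sketch, using hand-written unit vectors as stand-ins for `embed_text` output (`sketch_dot` is a hypothetical helper, not part of the library):

```ruby
# Dot product of two equal-length vectors. For unit vectors this equals
# cosine similarity, so no division by magnitudes is needed.
def sketch_dot(a, b)
  raise ArgumentError, "length mismatch" unless a.length == b.length
  a.zip(b).sum { |x, y| x * y }
end

# Hand-written unit vectors standing in for provider output.
same  = [0.6, 0.8]
other = [0.8, -0.6]

self_similarity  = sketch_dot(same, same)   # close to 1.0 (identical direction)
cross_similarity = sketch_dot(same, other)  # close to 0.0 (orthogonal)
```

A test can therefore compare similarity scores against fixed thresholds without first measuring vector magnitudes.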

Examples:

zero-config

Parse::Embeddings.provider(:fixture).embed_text(["hello"])
# => [[0.012, -0.043, ...]]   # length == 64 (default)

custom dimensions

provider = Parse::Embeddings::Fixture.new(dimensions: 1536)
Parse::Embeddings.register(:openai_stub, provider)

Constant Summary

DEFAULT_DIMENSIONS =
64
DEFAULT_MODEL_NAME =
"fixture-deterministic"
MAX_DIMENSIONS =

Matches Parse::Vector::MAX_DIMENSIONS — keeps a runaway test constructor (Fixture.new(dimensions: 10_000_000)) from hanging the suite on the SHA-256 chain expansion.

16_384

Constants inherited from Provider

Provider::AS_NOTIFICATION_NAME

Instance Method Summary

Methods inherited from Provider

#embed_batch_size, #embed_image, #embed_text_batched, #inspect, #inspect_attrs, #instrument_embed, #max_input_tokens, #modalities, #validate_response!

Constructor Details

#initialize(dimensions: DEFAULT_DIMENSIONS, model_name: DEFAULT_MODEL_NAME) ⇒ Fixture

Returns a new instance of Fixture.

Parameters:

  • dimensions (Integer) (defaults to: DEFAULT_DIMENSIONS)

    output vector width (1..16384). Choose the width of the production provider you're stubbing.

  • model_name (String) (defaults to: DEFAULT_MODEL_NAME)

    identifier persisted to embedding_meta and used in cache keys.



# File 'lib/parse/embeddings/fixture.rb', line 40

def initialize(dimensions: DEFAULT_DIMENSIONS, model_name: DEFAULT_MODEL_NAME)
  unless dimensions.is_a?(Integer) && dimensions.positive?
    raise ArgumentError,
          "Parse::Embeddings::Fixture: dimensions must be a positive Integer (got #{dimensions.inspect})."
  end
  if dimensions > MAX_DIMENSIONS
    raise ArgumentError,
          "Parse::Embeddings::Fixture: dimensions #{dimensions} exceeds MAX_DIMENSIONS (#{MAX_DIMENSIONS})."
  end
  @dimensions = dimensions
  @model_name = model_name.to_s
end
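To see which values the two guards accept, here is a standalone sketch that mirrors them. `SKETCH_MAX_DIMENSIONS`, `sketch_check_dimensions`, and `sketch_accepted?` are hypothetical stand-ins written for this example; the real checks live in the constructor shown above.

```ruby
# Mirrors the documented MAX_DIMENSIONS value; not the library constant.
SKETCH_MAX_DIMENSIONS = 16_384

# Stand-in for the constructor's two guards, in the same order.
def sketch_check_dimensions(value)
  unless value.is_a?(Integer) && value.positive?
    raise ArgumentError, "dimensions must be a positive Integer (got #{value.inspect})."
  end
  if value > SKETCH_MAX_DIMENSIONS
    raise ArgumentError, "dimensions #{value} exceeds MAX_DIMENSIONS (#{SKETCH_MAX_DIMENSIONS})."
  end
  value
end

# Returns true when the guards pass and false when they raise.
def sketch_accepted?(value)
  sketch_check_dimensions(value)
  true
rescue ArgumentError
  false
end

accepted = [1, 64, 1536, 16_384].map { |v| sketch_accepted?(v) }
rejected = [0, -1, 64.0, "64", nil, 16_385].map { |v| sketch_accepted?(v) }
```

Note that the type check is strict: a Float such as 64.0 or a String such as "64" is rejected rather than coerced.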

Instance Method Details

#dimensions ⇒ Object



# File 'lib/parse/embeddings/fixture.rb', line 53

def dimensions
  @dimensions
end

#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>

Returns one unit vector per input.

Parameters:

  • strings (Array<String>)

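    inputs to embed. Every element must be a String; otherwise ArgumentError is raised. An empty array returns [].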

  • input_type (Symbol) (defaults to: :search_document)

    :search_query or :search_document. Any other symbol is also accepted; Fixture treats each distinct value as an independent seed.

Returns:

  • (Array<Array<Float>>)

    one unit vector per input.



# File 'lib/parse/embeddings/fixture.rb', line 73

def embed_text(strings, input_type: :search_document)
  unless strings.is_a?(Array)
    raise ArgumentError,
          "Parse::Embeddings::Fixture#embed_text expects Array<String> (got #{strings.class})."
  end
  return [] if strings.empty?
  type_tag = input_type.to_s
  # Validate inputs BEFORE entering the instrument block so a
  # caller-shape error isn't recorded as a successful embed in
  # AS::N. The fixture has no network call, but emitting the
  # event keeps subscriber wiring uniform across providers —
  # operators developing against the Fixture see the same event
  # tree they'll see in production against OpenAI.
  strings.each do |s|
    unless s.is_a?(String)
      raise ArgumentError,
            "Parse::Embeddings::Fixture#embed_text element must be String (got #{s.class})."
    end
  end
  instrument_embed(strings.length, input_type) do |_emit_payload|
    vectors = strings.map { |s| seeded_unit_vector("#{@model_name}\0#{type_tag}\0#{s}") }
    validate_response!(strings.length, vectors)
  end
end
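The ordering in the method above (shape checks first, instrumented work second) can be illustrated with a standalone sketch. `SketchRecorder` and `sketch_embed` are hypothetical stand-ins: the recorder only appends an event when its block completes, which is a simplification and not a description of how `Provider#instrument_embed` behaves.

```ruby
# Hypothetical instrumentation hook: records one event per completed block.
class SketchRecorder
  attr_reader :events

  def initialize
    @events = []
  end

  def instrument(count)
    result = yield
    @events << { count: count }
    result
  end
end

# Stand-in for embed_text: validates before entering the instrumented block.
def sketch_embed(recorder, strings)
  raise ArgumentError, "expects Array<String> (got #{strings.class})." unless strings.is_a?(Array)
  return [] if strings.empty? # early return: no event for empty input
  strings.each do |s|
    raise ArgumentError, "element must be String (got #{s.class})." unless s.is_a?(String)
  end
  # Placeholder vectors; the real method returns seeded unit vectors.
  recorder.instrument(strings.length) { strings.map { |s| [s.length.to_f] } }
end

recorder = SketchRecorder.new
sketch_embed(recorder, ["hello", "hi"])
empty_result = sketch_embed(recorder, [])
begin
  sketch_embed(recorder, ["ok", 42])
rescue ArgumentError
  # The bad element was rejected before the instrumented block ran.
end
```

After these three calls the recorder holds a single event, for the one call that actually embedded something.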

#model_name ⇒ Object



# File 'lib/parse/embeddings/fixture.rb', line 57

def model_name
  @model_name
end

#normalize? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/embeddings/fixture.rb', line 61

def normalize?
  true
end

#supports_input_type? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/parse/embeddings/fixture.rb', line 65

def supports_input_type?
  true
end