Class: Parse::Embeddings::Cohere
- Defined in: lib/parse/embeddings/cohere.rb
Overview
Cohere embeddings provider. Wraps POST /v1/embed.
Supported models:
- v4: embed-v4.0 (1536 native; Matryoshka widths 256, 512, 1024, 1536; 128k-token context). Unified text + image model at the network boundary; this provider exposes the text-input path only. Image inputs will land in v5.1 alongside the Provider#embed_image hook.
- v3: embed-english-v3.0, embed-multilingual-v3.0 (both 1024-dim), embed-english-light-v3.0, embed-multilingual-light-v3.0 (both 384-dim). Text-only.
Asymmetric input types
Cohere is one of the providers that DOES distinguish queries from
documents at the wire level via the input_type request field.
Sending input_type: "search_query" for a query and
"search_document" for a corpus item is required for good recall
on Cohere's v3 models — using the same type for both halves of a
retrieval pair degrades nDCG by a noticeable margin (Cohere's own
benchmarks). Provider#supports_input_type? returns true here
so callers / cache-keying middleware can branch on this.
The accepted Symbol values map to the Cohere wire strings:
- :search_query → "search_query"
- :search_document → "search_document"
- :classification → "classification"
- :clustering → "clustering"
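Because the input type changes the vector Cohere returns, a cache sitting in front of the provider probably needs to fold it into its key. The following is a minimal sketch of that branch, not the SDK's middleware: StubProvider and embedding_cache_key are hypothetical names, and only supports_input_type? and model_name come from this page.

```ruby
require "digest"

# Hypothetical stand-in for a provider; it exposes only the two methods
# the helper below relies on.
StubProvider = Struct.new(:model_name, :input_type_aware) do
  def supports_input_type?
    input_type_aware
  end
end

# Hypothetical cache-key helper (not part of the SDK). The input_type is
# folded into the key only when the provider distinguishes it on the wire.
def embedding_cache_key(provider, text, input_type:)
  parts = [provider.model_name, text]
  parts << input_type.to_s if provider.supports_input_type?
  Digest::SHA256.hexdigest(parts.join("\u0000"))
end

cohere_like = StubProvider.new("embed-english-v3.0", true)
symmetric   = StubProvider.new("some-symmetric-model", false)

query_key = embedding_cache_key(cohere_like, "ruby sdk", input_type: :search_query)
doc_key   = embedding_cache_key(cohere_like, "ruby sdk", input_type: :search_document)
```

With a provider that reports supports_input_type? as true, the same text yields two distinct keys, so a query vector is never served where a document vector was requested.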
Security
- The Faraday connection refuses proxy: unless the caller opts in via allow_faraday_proxy: true. Env-proxy autodiscovery (HTTPS_PROXY etc.) is suppressed by default, the same model as Parse::Client and OpenAI.
- #inspect (inherited from Provider) never surfaces @api_key.
- Authorization and Cohere-Api-Key are in Middleware::BodyBuilder::REDACTED_HEADERS.
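To illustrate what header redaction amounts to, here is a minimal sketch under stated assumptions. The constant below holds only the two header names mentioned above (the real Middleware::BodyBuilder::REDACTED_HEADERS may list more), and redact_headers is a hypothetical helper, not the SDK's implementation.

```ruby
# Assumed contents: only the two header names this page mentions.
REDACTED_HEADERS = %w[authorization cohere-api-key].freeze

# Hypothetical helper: returns a copy of the headers that is safe to log.
# Header names are compared case-insensitively.
def redact_headers(headers)
  headers.to_h do |name, value|
    [name, REDACTED_HEADERS.include?(name.to_s.downcase) ? "[REDACTED]" : value]
  end
end

safe = redact_headers({
  "Authorization" => "Bearer secret-token",
  "Content-Type"  => "application/json",
})
```

The original hash is left untouched, so the request itself still carries the real credential.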
Defined Under Namespace
Classes: AuthenticationError, BadRequestError, RateLimitError, TransientError
Constant Summary
- DEFAULT_BASE_URL =
"https://api.cohere.com/v1"
- DEFAULT_MODEL =
"embed-english-v3.0"
- DEFAULT_TIMEOUT =
30
- DEFAULT_OPEN_TIMEOUT =
5
- DEFAULT_MAX_RETRIES =
3
- DEFAULT_BATCH_SIZE =
Cohere documents a hard cap of 96 inputs per /embed call.
96
- MAX_RESPONSE_BYTES =
16 * 1024 * 1024
- MODEL_DEFAULT_DIMENSIONS =
{ "embed-v4.0" => 1536, "embed-english-v3.0" => 1024, "embed-multilingual-v3.0" => 1024, "embed-english-light-v3.0" => 384, "embed-multilingual-light-v3.0" => 384, }.freeze
- MODEL_MAX_INPUT_TOKENS =
{ "embed-v4.0" => 128_000, "embed-english-v3.0" => 512, "embed-multilingual-v3.0" => 512, "embed-english-light-v3.0" => 512, "embed-multilingual-light-v3.0" => 512, }.freeze
- MATRYOSHKA_MODELS =
Models that accept Cohere's output_dimension Matryoshka truncation parameter. v4.0 is the only such row today; v3 models reject the field with a 400.
%w[embed-v4.0].freeze
- MATRYOSHKA_WIDTHS =
Allowed Matryoshka widths per model (Cohere quantizes the available truncations rather than accepting any integer ≤ native). Empty allowlist = any integer ≤ native is fine, but for v4.0 Cohere documents exactly these four widths.
{ "embed-v4.0" => [256, 512, 1024, 1536].freeze, }.freeze
- INPUT_TYPE_WIRE_VALUES =
Map SDK-canonical input_type symbols to Cohere wire strings. Symbols outside this set raise: silently downgrading :unknown_type to "search_document" would mask cache-key bugs in higher layers (the value participates in cache keys).
{ search_query: "search_query", search_document: "search_document", classification: "classification", clustering: "clustering", }.freeze
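The interaction between MODEL_DEFAULT_DIMENSIONS, MATRYOSHKA_MODELS and MATRYOSHKA_WIDTHS can be sketched as a small resolver. This is an illustration only: the provider's own validate_dimensions! is not shown on this page and may behave differently, resolve_dimensions is a hypothetical name, and the tables are trimmed copies of the constants above.

```ruby
# Trimmed copies of the constants listed above.
MODEL_DEFAULT_DIMENSIONS = { "embed-v4.0" => 1536, "embed-english-v3.0" => 1024 }.freeze
MATRYOSHKA_WIDTHS = { "embed-v4.0" => [256, 512, 1024, 1536].freeze }.freeze

# Hypothetical resolver: returns the width to use, or raises when the
# requested width is not valid for the model.
def resolve_dimensions(model, requested)
  native = MODEL_DEFAULT_DIMENSIONS.fetch(model)
  return native if requested.nil? || requested == native

  allowed = MATRYOSHKA_WIDTHS[model]
  raise ArgumentError, "#{model} does not support truncated widths" if allowed.nil?
  raise ArgumentError, "#{requested} is not one of #{allowed.inspect}" unless allowed.include?(requested)

  requested
end

v4_width = resolve_dimensions("embed-v4.0", 512)
v3_width = resolve_dimensions("embed-english-v3.0", nil)
```

Rejecting an unsupported width locally avoids a round trip that would end in a 400 from the API.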
Constants inherited from Provider
Provider::AS_NOTIFICATION_NAME
Instance Method Summary
- #dimensions ⇒ Object
- #embed_batch_size ⇒ Object
- #embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
Vectors aligned 1:1 with strings.
- #initialize(api_key:, model: DEFAULT_MODEL, dimensions: nil, base_url: DEFAULT_BASE_URL, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_faraday_proxy: false, allow_insecure_base_url: false, connection: nil) ⇒ Cohere
constructor
A new instance of Cohere.
- #inspect_attrs ⇒ Object
- #max_input_tokens ⇒ Object
- #model_name ⇒ Object
- #normalize? ⇒ Boolean
- #supports_input_type? ⇒ Boolean
Methods inherited from Provider
#embed_image, #embed_text_batched, #inspect, #instrument_embed, #modalities, #validate_response!
Constructor Details
#initialize(api_key:, model: DEFAULT_MODEL, dimensions: nil, base_url: DEFAULT_BASE_URL, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_faraday_proxy: false, allow_insecure_base_url: false, connection: nil) ⇒ Cohere
Returns a new instance of Cohere.
# File 'lib/parse/embeddings/cohere.rb', line 130
def initialize(
  api_key:,
  model: DEFAULT_MODEL,
  dimensions: nil,
  base_url: DEFAULT_BASE_URL,
  timeout: DEFAULT_TIMEOUT,
  open_timeout: DEFAULT_OPEN_TIMEOUT,
  max_retries: DEFAULT_MAX_RETRIES,
  embed_batch_size: DEFAULT_BATCH_SIZE,
  allow_faraday_proxy: false,
  allow_insecure_base_url: false,
  connection: nil
)
  validate_api_key!(api_key)
  validate_model!(model)
  validate_dimensions!(model, dimensions)
  sanitized_base_url = validate_base_url!(base_url, allow_insecure_base_url)
  validate_positive_integer!(:timeout, timeout)
  validate_positive_integer!(:open_timeout, open_timeout)
  validate_non_negative_integer!(:max_retries, max_retries)
  validate_positive_integer!(:embed_batch_size, embed_batch_size)
  if embed_batch_size > 96
    raise ArgumentError,
          "Parse::Embeddings::Cohere: embed_batch_size #{embed_batch_size} exceeds Cohere's per-request cap (96)."
  end

  @api_key = api_key
  @model = model
  @dimensions = dimensions || MODEL_DEFAULT_DIMENSIONS.fetch(model)
  @base_url = sanitized_base_url
  @timeout = timeout
  @open_timeout = open_timeout
  @max_retries = max_retries
  @embed_batch_size = embed_batch_size
  @allow_faraday_proxy = allow_faraday_proxy
  @connection = connection || build_connection
end
Instance Method Details
#dimensions ⇒ Object
# File 'lib/parse/embeddings/cohere.rb', line 168
def dimensions
  @dimensions
end
#embed_batch_size ⇒ Object
# File 'lib/parse/embeddings/cohere.rb', line 176
def embed_batch_size
  @embed_batch_size
end
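The batch size matters because Cohere caps a single /embed call at 96 inputs, so larger inputs have to be split. The sketch below shows one way such slicing could look; it is not the inherited Provider#embed_text_batched (whose source is not shown here). embed_in_batches and COHERE_MAX_BATCH are hypothetical names, and the block stands in for a real provider call.

```ruby
COHERE_MAX_BATCH = 96 # per-request cap stated in the constant summary

# Hypothetical batching helper. The block receives an Array<String> and
# must return one vector per input, in order.
def embed_in_batches(strings, batch_size: COHERE_MAX_BATCH, &embed)
  unless (1..COHERE_MAX_BATCH).cover?(batch_size)
    raise ArgumentError, "batch_size must be within 1..#{COHERE_MAX_BATCH}"
  end

  # each_slice preserves order, so the result stays aligned 1:1 with strings.
  strings.each_slice(batch_size).flat_map { |slice| embed.call(slice) }
end

calls = []
# Fake embedder: records the slice size and returns a 1-dim vector per string.
fake_embed = lambda do |slice|
  calls << slice.length
  slice.map { |s| [s.length.to_f] }
end

vectors = embed_in_batches(Array.new(200) { |i| "doc #{i}" }, &fake_embed)
```

Here 200 inputs are split into requests of 96, 96 and 8 items, and the combined result keeps the original ordering.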
#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
Returns vectors aligned 1:1 with strings.
# File 'lib/parse/embeddings/cohere.rb', line 196
def embed_text(strings, input_type: :search_document)
  unless strings.is_a?(Array)
    raise ArgumentError,
          "Parse::Embeddings::Cohere#embed_text expects Array<String> (got #{strings.class})."
  end
  return [] if strings.empty?

  strings.each_with_index do |s, i|
    unless s.is_a?(String)
      raise ArgumentError,
            "Parse::Embeddings::Cohere#embed_text strings[#{i}] is not a String (#{s.class})."
    end
    if s.empty?
      raise ArgumentError,
            "Parse::Embeddings::Cohere#embed_text strings[#{i}] is empty; Cohere rejects empty inputs."
    end
  end

  wire_input_type = INPUT_TYPE_WIRE_VALUES[input_type]
  unless wire_input_type
    raise ArgumentError,
          "Parse::Embeddings::Cohere#embed_text input_type #{input_type.inspect} not in " \
          "#{INPUT_TYPE_WIRE_VALUES.keys.inspect}."
  end

  body = {
    texts: strings,
    model: @model,
    input_type: wire_input_type,
    embedding_types: ["float"],
  }
  # Forward `output_dimension` only for Matryoshka-capable models
  # whose active width differs from native. Sending it to a v3
  # row would yield a 400 from Cohere.
  if MATRYOSHKA_MODELS.include?(@model) && @dimensions != MODEL_DEFAULT_DIMENSIONS.fetch(@model)
    body[:output_dimension] = @dimensions
  end

  instrument_embed(strings.length, input_type) do |emit_payload|
    payload = (body)
    # Cohere's response carries `meta.billed_units.input_tokens`
    # (and `output_tokens`, though for embeddings it's 0). Forward
    # input_tokens as the operator-facing cost number on the AS::N
    # payload so cost subscribers can budget across providers.
    if payload.is_a?(Hash) && payload["meta"].is_a?(Hash) && payload["meta"]["billed_units"].is_a?(Hash)
      tt = payload["meta"]["billed_units"]["input_tokens"]
      emit_payload[:total_tokens] = tt if tt.is_a?(Integer) && tt >= 0
    end
    vectors = extract_vectors!(payload, strings.length)
    validate_response!(strings.length, vectors)
  end
end
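The request-body step of #embed_text can be restated as a standalone function, which makes the output_dimension rule easier to see in isolation. This is a sketch rather than the provider's code path: build_embed_body is a hypothetical name, the two tables are trimmed copies of the class constants, and the instance state is passed in as keyword arguments.

```ruby
require "json"

# Trimmed copies of the class constants.
MODEL_DEFAULT_DIMENSIONS = { "embed-v4.0" => 1536, "embed-english-v3.0" => 1024 }.freeze
MATRYOSHKA_MODELS = %w[embed-v4.0].freeze

# Hypothetical standalone version of the body-building step.
def build_embed_body(model:, dimensions:, texts:, wire_input_type:)
  body = {
    texts: texts,
    model: model,
    input_type: wire_input_type,
    embedding_types: ["float"],
  }
  # output_dimension is sent only for Matryoshka-capable models running
  # at a non-native width.
  if MATRYOSHKA_MODELS.include?(model) && dimensions != MODEL_DEFAULT_DIMENSIONS.fetch(model)
    body[:output_dimension] = dimensions
  end
  body
end

truncated = build_embed_body(model: "embed-v4.0", dimensions: 512,
                             texts: ["hello"], wire_input_type: "search_query")
native    = build_embed_body(model: "embed-v4.0", dimensions: 1536,
                             texts: ["hello"], wire_input_type: "search_query")
v3        = build_embed_body(model: "embed-english-v3.0", dimensions: 1024,
                             texts: ["hello"], wire_input_type: "search_document")

puts JSON.generate(truncated)
```

Only the truncated v4 request carries output_dimension; the native-width v4 request and the v3 request omit the field entirely.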
#inspect_attrs ⇒ Object
# File 'lib/parse/embeddings/cohere.rb', line 249
def inspect_attrs
  super.merge(base: safe_base_host, retries: @max_retries)
end
#max_input_tokens ⇒ Object
# File 'lib/parse/embeddings/cohere.rb', line 180
def max_input_tokens
  MODEL_MAX_INPUT_TOKENS[@model]
end
#model_name ⇒ Object
# File 'lib/parse/embeddings/cohere.rb', line 172
def model_name
  @model
end
#normalize? ⇒ Boolean
# File 'lib/parse/embeddings/cohere.rb', line 184
def normalize?
  # Cohere v3 embeddings are documented unit-normalized.
  true
end
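One practical consequence of unit-normalized vectors is that cosine similarity reduces to a plain dot product. The sketch below demonstrates this with hand-made vectors; how callers actually use normalize? is an assumption here, and dot, cosine and unit are illustrative helpers, not SDK methods.

```ruby
# Dot product of two equal-length Float arrays.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Cosine similarity computed the general way, dividing by both norms.
def cosine(a, b)
  dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)))
end

# Scales a vector to unit length.
def unit(v)
  norm = Math.sqrt(dot(v, v))
  v.map { |x| x / norm }
end

a = unit([3.0, 4.0])
b = unit([1.0, 2.0])

similarity_via_dot    = dot(a, b)
similarity_via_cosine = cosine(a, b)
```

For unit vectors the two values agree up to floating-point error, so a vector store could safely use the cheaper dot-product metric.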
#supports_input_type? ⇒ Boolean
# File 'lib/parse/embeddings/cohere.rb', line 189
def supports_input_type?
  true
end