Class: Parse::Embeddings::LocalHTTP
Defined in: lib/parse/embeddings/local_http.rb
Overview
Generic OpenAI-compatible local embedding provider. Talks to any
server that exposes POST <base_url>/embeddings with the OpenAI
request/response shape — covers Ollama (/v1), LM Studio (/v1),
vLLM, llama.cpp's server, and any reverse-proxy that translates
to a local model runner.
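As a rough illustration of the "OpenAI request/response shape" mentioned above, the sketch below builds a request body and reads vectors back out of a hand-written sample response. The helper names (build_embeddings_request, parse_embeddings_response), the sample payload, and the sort-by-index step are assumptions made for this example; they are not taken from the library's source.

```ruby
require "json"

# Build the JSON body for POST <base_url>/embeddings (OpenAI shape).
def build_embeddings_request(model, inputs)
  JSON.generate({ model: model, input: inputs })
end

# Pull vectors out of an OpenAI-shaped response, ordered by "index"
# so they line up 1:1 with the inputs (an assumption of this sketch).
def parse_embeddings_response(json)
  payload = JSON.parse(json)
  payload.fetch("data")
         .sort_by { |row| row.fetch("index") }
         .map { |row| row.fetch("embedding") }
end

# Hand-written sample; a real server returns much wider vectors.
sample = JSON.generate({
  "object" => "list",
  "data" => [
    { "object" => "embedding", "index" => 1, "embedding" => [0.3, 0.4] },
    { "object" => "embedding", "index" => 0, "embedding" => [0.1, 0.2] }
  ],
  "model" => "example-model",
  "usage" => { "prompt_tokens" => 4, "total_tokens" => 4 }
})

request_body = build_embeddings_request("example-model", ["first", "second"])
vectors = parse_embeddings_response(sample)
```

Any server that accepts such a body and answers with a "data" array of "embedding" entries should be reachable through this provider.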
SSRF gate
The base_url is resolved at construction time and the resolved
addresses are checked against File::BLOCKED_CIDRS
(loopback, RFC1918, link-local, cloud-metadata, CGNAT, IPv6 ULA,
…). When ANY resolved address falls in a private/internal range,
the constructor refuses unless the caller opts in via
allow_private_endpoint: true.
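The check described above can be sketched with Ruby's standard IPAddr class. This is a minimal sketch under stated assumptions: EXAMPLE_BLOCKED_CIDRS is an illustrative subset rather than the library's actual list, the helper names are hypothetical, and the sketch takes already-resolved address strings instead of performing DNS resolution itself.

```ruby
require "ipaddr"

# Illustrative subset only; the library's real list is BLOCKED_CIDRS.
EXAMPLE_BLOCKED_CIDRS = %w[
  127.0.0.0/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
  169.254.0.0/16 100.64.0.0/10 ::1/128 fc00::/7 fe80::/10
].map { |cidr| IPAddr.new(cidr) }.freeze

# True when the address falls inside any blocked range.
def private_address?(addr)
  ip = IPAddr.new(addr)
  EXAMPLE_BLOCKED_CIDRS.any? { |range| range.include?(ip) }
end

# Returns true when the endpoint is private and the caller opted in,
# false when every address is public, and raises when ANY address is
# private without the opt-in.
def gate_endpoint!(resolved_addrs, allow_private_endpoint: false)
  private_addrs = resolved_addrs.select { |a| private_address?(a) }
  return false if private_addrs.empty?

  unless allow_private_endpoint
    raise ArgumentError, "refusing private address(es) #{private_addrs.inspect}"
  end
  true
end
```

Checking every resolved address, rather than only the first, matters because a hostname can resolve to a mix of public and internal records.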
The opt-in is a deliberate, auditable gate: Parse::Embeddings
registration is configuration code, not user input, so opting in
to "yes, this base_url really is my Ollama on localhost" is a
one-line decision by the operator at boot time. A Kernel#warn
fires when the opt-in is taken so the choice shows up in operator
logs / bundle exec rake about output.
http:// base URLs are accepted with allow_private_endpoint: true
(the typical local-runner deployment), and refused otherwise unless
the caller also passes allow_insecure_base_url: true (escape
hatch for self-signed internal HTTPS proxies fronted by http://).
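The scheme rule in the paragraph above reduces to a small truth table. The sketch below is one possible reading of it: scheme_allowed? is a hypothetical helper, and the assumptions that https:// always passes this particular check and that any other scheme is refused go beyond what the text states.

```ruby
require "uri"

# Decide whether the URL scheme alone is acceptable. The address check
# (private vs. public) is a separate step and is not modelled here.
def scheme_allowed?(base_url, allow_private_endpoint: false, allow_insecure_base_url: false)
  scheme = URI.parse(base_url).scheme
  return true if scheme == "https"
  return false unless scheme == "http" # assumption: other schemes refused

  # Plain http needs one of the two explicit opt-ins.
  allow_private_endpoint || allow_insecure_base_url
end
```

With both flags left at their defaults, a plain http:// URL is refused, so a local runner on http://localhost needs at least one explicit opt-in.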
Why no fixed model whitelist
Ollama, LM Studio, and vLLM all serve operator-chosen models —
we cannot enumerate "supported" models the way OpenAI can. The
constructor instead takes the dimensions: explicitly, and the
provider's Provider#validate_response! (inherited) enforces that every
returned vector matches that width. Mis-specified dimensions
surface as InvalidResponseError on the first embed call.
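A width check of this kind might look like the sketch below. It is not the body of Provider#validate_response!, which this page does not show; check_vectors! and ExampleInvalidResponseError are hypothetical stand-ins used only to illustrate the idea.

```ruby
# Stand-in for the library's InvalidResponseError.
class ExampleInvalidResponseError < StandardError; end

# Verify the vector count and that every vector has the declared width.
def check_vectors!(expected_count, vectors, dimensions)
  unless vectors.is_a?(Array) && vectors.length == expected_count
    raise ExampleInvalidResponseError, "expected #{expected_count} vectors"
  end

  vectors.each_with_index do |vec, i|
    width = vec.is_a?(Array) ? vec.length : nil
    next if width == dimensions

    raise ExampleInvalidResponseError,
          "vector #{i} has width #{width.inspect}, expected #{dimensions}"
  end
  vectors
end

ok = check_vectors!(2, [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], 3)
```

Because the check runs against real responses, declaring the wrong dimensions: for a model fails on the first call instead of silently storing vectors of the wrong size.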
Security
Defined Under Namespace
Classes: AuthenticationError, BadRequestError, RateLimitError, TransientError
Constant Summary
- DEFAULT_TIMEOUT = 30
- DEFAULT_OPEN_TIMEOUT = 5
- DEFAULT_MAX_RETRIES = 3
- DEFAULT_BATCH_SIZE = 32
- MAX_RESPONSE_BYTES = 16 * 1024 * 1024
Constants inherited from Provider
Provider::AS_NOTIFICATION_NAME
Instance Method Summary
- #dimensions ⇒ Object
- #embed_batch_size ⇒ Object
- #embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
  Vectors aligned 1:1 with strings.
- #initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP (constructor)
  A new instance of LocalHTTP.
- #inspect_attrs ⇒ Object
- #model_name ⇒ Object
- #normalize? ⇒ Boolean
- #supports_input_type? ⇒ Boolean
Methods inherited from Provider
#embed_image, #embed_text_batched, #inspect, #instrument_embed, #max_input_tokens, #modalities, #validate_response!
Constructor Details
#initialize(base_url:, model:, dimensions:, api_key: nil, normalize: false, timeout: DEFAULT_TIMEOUT, open_timeout: DEFAULT_OPEN_TIMEOUT, max_retries: DEFAULT_MAX_RETRIES, embed_batch_size: DEFAULT_BATCH_SIZE, allow_private_endpoint: false, allow_insecure_base_url: false, allow_faraday_proxy: false, connection: nil) ⇒ LocalHTTP
Returns a new instance of LocalHTTP.
# File 'lib/parse/embeddings/local_http.rb', line 114
def initialize(
  base_url:,
  model:,
  dimensions:,
  api_key: nil,
  normalize: false,
  timeout: DEFAULT_TIMEOUT,
  open_timeout: DEFAULT_OPEN_TIMEOUT,
  max_retries: DEFAULT_MAX_RETRIES,
  embed_batch_size: DEFAULT_BATCH_SIZE,
  allow_private_endpoint: false,
  allow_insecure_base_url: false,
  allow_faraday_proxy: false,
  connection: nil
)
  validate_model!(model)
  validate_dimensions!(dimensions)
  validate_optional_api_key!(api_key)
  unless [true, false].include?(normalize)
    raise ArgumentError, "Parse::Embeddings::LocalHTTP: normalize must be true or false (got #{normalize.inspect})."
  end
  validate_positive_integer!(:timeout, timeout)
  validate_positive_integer!(:open_timeout, open_timeout)
  validate_non_negative_integer!(:max_retries, max_retries)
  validate_positive_integer!(:embed_batch_size, embed_batch_size)

  sanitized_base_url, resolved_addrs, is_private =
    validate_base_url_and_gate_ssrf!(base_url,
                                     allow_private_endpoint: allow_private_endpoint,
                                     allow_insecure_base_url: allow_insecure_base_url)
  if is_private
    # Audit log. Emits once per instance — Kernel#warn so it lands
    # on stderr and any logger that captures it. Operators running
    # a hardened environment can grep this to confirm every
    # private-endpoint opt-in was intentional.
    warn "Parse::Embeddings::LocalHTTP: allow_private_endpoint=true for #{sanitized_base_url} — " \
         "resolved to private address(es) #{resolved_addrs.map(&:to_s).inspect}."
  end

  @base_url = sanitized_base_url
  @model = model
  @dimensions = dimensions
  @api_key = api_key
  @normalize = normalize
  @timeout = timeout
  @open_timeout = open_timeout
  @max_retries = max_retries
  @embed_batch_size = embed_batch_size
  @allow_faraday_proxy = allow_faraday_proxy
  @connection = connection || build_connection
end
Instance Method Details
#dimensions ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 167
def dimensions
  @dimensions
end
#embed_batch_size ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 175
def embed_batch_size
  @embed_batch_size
end
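This page does not show how the inherited Provider#embed_text_batched consumes this value. One plausible use of a batch size, offered only as a sketch, is to slice the inputs and concatenate the per-batch results so the output stays aligned with the input order. The embed_in_batches helper and the fake embedder below are hypothetical.

```ruby
# Split inputs into slices of batch_size, hand each slice to the block,
# and concatenate the per-slice results in order.
def embed_in_batches(strings, batch_size)
  strings.each_slice(batch_size).flat_map { |batch| yield(batch) }
end

# Fake embedder for the demo: one 1-dimensional vector per string,
# holding the string's length. It also records each batch size.
calls = []
fake_embed = lambda do |batch|
  calls << batch.length
  batch.map { |s| [s.length.to_f] }
end

vectors = embed_in_batches(%w[a bb ccc dddd eeeee], 2) { |batch| fake_embed.call(batch) }
```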
#embed_text(strings, input_type: :search_document) ⇒ Array<Array<Float>>
Returns vectors aligned 1:1 with strings.
# File 'lib/parse/embeddings/local_http.rb', line 196
def embed_text(strings, input_type: :search_document)
  unless strings.is_a?(Array)
    raise ArgumentError, "Parse::Embeddings::LocalHTTP#embed_text expects Array<String> (got #{strings.class})."
  end
  return [] if strings.empty?
  strings.each_with_index do |s, i|
    unless s.is_a?(String)
      raise ArgumentError, "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is not a String (#{s.class})."
    end
    if s.empty?
      raise ArgumentError, "Parse::Embeddings::LocalHTTP#embed_text strings[#{i}] is empty; local runners typically reject empty inputs."
    end
  end

  body = { input: strings, model: @model }
  instrument_embed(strings.length, input_type) do |emit_payload|
    payload = (body)
    # Local runners may or may not include `usage`. When present,
    # forward total_tokens to the AS::N payload.
    if payload.is_a?(Hash) && payload["usage"].is_a?(Hash)
      tt = payload["usage"]["total_tokens"]
      emit_payload[:total_tokens] = tt if tt.is_a?(Integer) && tt >= 0
    end
    vectors = extract_vectors!(payload, strings.length)
    validate_response!(strings.length, vectors)
  end
end
#inspect_attrs ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 228
def inspect_attrs
  super.merge(base: safe_base_host, retries: @max_retries)
end
#model_name ⇒ Object
# File 'lib/parse/embeddings/local_http.rb', line 171
def model_name
  @model
end
#normalize? ⇒ Boolean
# File 'lib/parse/embeddings/local_http.rb', line 179
def normalize?
  @normalize
end
#supports_input_type? ⇒ Boolean
# File 'lib/parse/embeddings/local_http.rb', line 183
def supports_input_type?
  # The OpenAI-compatible local runners do not asymmetrize. Some
  # models (bge-*) have a documented query prefix, but the local
  # server itself doesn't expose `input_type:` — callers wrap the
  # query text instead. We accept the kwarg for cache-key stability
  # but drop it at the wire level.
  false
end
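Since the kwarg is dropped at the wire level, a caller that needs asymmetric behaviour has to wrap the query text before calling #embed_text. The sketch below shows one way to do that. The wrap_for_input_type helper is hypothetical, the :search_query symbol is assumed by analogy with the :search_document default, and the prefix string is the one commonly documented for English bge-* v1.5 models; confirm it against the model card of the model actually being served.

```ruby
# Assumed query prefix for English bge-* v1.5 models; verify per model.
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

# Prefix queries only; documents are embedded unchanged.
def wrap_for_input_type(text, input_type)
  input_type == :search_query ? "#{BGE_QUERY_PREFIX}#{text}" : text
end

query_text = wrap_for_input_type("how do I reset my password", :search_query)
doc_text = wrap_for_input_type("To reset your password, open Settings.", :search_document)
```

The wrapped strings can then be passed to #embed_text like any other input.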