ユーザーが回答生成時に参照させたいドキュメントのファイル一式を、S3にデータソース(図中のData Source)として登録します。
Knowledge baseはドキュメントをチャンクに分割(Splitting int chunks)、埋め込みモデル(Embedding Model)を用いてベクトル変換(Generating Embeddings)します。その結果をベクターDB(Vector DB)に保存します。
このようにすることで、次段の質問処理で、質問文と関連のある文章をスピーディーに検索することができるようになります。
質問処理
Knowledge baseは、ユーザの質問(User Query)を受け取ったら、前処理に使ったのと同じ埋め込みモデル(Embedding Model)を用いて質問をベクトル変換(Generating Embeddings)し、この質問文のベクトルと前処理で構築したベクターDBにあるベクトルたちと比較することで、質問文に一番近いドキュメントを引き当てます(Retrieve similar documents)。
文章モデル(Text Model)は、この引き当てられたドキュメントとユーザの質問文を拡張して(Augment User Query with retrieved documents)、ユーザへの応答を生成し、返します(Respond to User) 。
Cat Indices APIで件数を確認してみます。1,000,000件入っていることが確認できます。
(レプリカ数)
# GET _cat/indices/single_vector_test?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size dataset.size
green open single_vector_test kNg4GFrKSzKrChb8EELNxA 1 0 1000000 0 33.4gb 33.4gb 33.4gb
検索処理
Elastisearchには正確に近傍検索をおこなう kNN と、近似計算で高速に処理をする ANN の2パターンがサポートされています。
from openai import OpenAI
defmain():
client = OpenAI(
api_key= 'API Keyを入れる'
)
response = client.chat.completions.create(
model="gpt-4-vision-preview",
max_tokens=1024,
messages=[
{
"role": "system",
"content": "You are an Optical Character Recognition (OCR) machine. You will extract all the characters from the image file in the URL provided by the user, and you will only provide the extracted text in your response. As an OCR machine, You can only respond with the extracted text."
},
{
"role": "user",
"content": [
{"type": "text", "text": "Please extract all characters within the image. Return the only extracted characters."},
{
"type": "image_url",
"image_url": {
"url": "画像のURL",
},
},
],
},
]
)
print(response.choices[0].message.content)
if __name__ == "__main__":
main()
ON COMPUTABLE NUMBERS, WITH AN APPLICATION TO THE ENTSCHEIDUNGSPROBLEM
By A. M. TURING.
[Received 28 May, 1936.—Read 12 November, 1936.]
The “computable” numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means.
Although the subject of this paper is ostensibly the computable numbers, it is almost equally easy to define and investigate computable
functions of an integral variable or a real or computable variable, computable predicates, and so forth.
The fundamental problems involved are, however, the same in each case, and I have chosen the computable numbers for explicit treatment as
involving the least cumbrous technique. I hope shortly to give an account of the relations of the computable numbers, functions, and so forth to one another.
This will include a development of the theory of functions of a real variable expressed in terms of computable numbers.
According to my definition, a number is computable if its decimal can be written down by a machine.
In §§ 9, 10 I give some arguments with the intention of showing that the computable numbers include all numbers which could naturally be regarded as computable.
In particular, I show that certain large classes of numbers are computable. They include, for instance, the real parts of all algebraic numbers, the real
parts of the zeros of the Bessel functions, the numbers π, e, etc.
The computable numbers do not, however, include all definable numbers, an example is given of a definable number which is not computable.
Although the class of computable numbers is so great, and in many ways similar to the class of real numbers, it is nevertheless enumerable.
In §§ I examine certain arguments which would seem to prove the contrary.
By the correct application of one of these arguments, conclusions are reached which are superficially similar to those of Gödel’s. These results
Gödel, “über formal unentscheidbare Satze der Principia Mathematica und verwandter Systeme I,” Monatshefte Math. Phys., 38 (1931), 173-198.