Study Log 211228

2021-12-28 17:32 Author: mayoiwill

Elasticsearch Basics

========================

# 211228

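# Expanding the PVC

- Reference
  - https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-volume-claim-templates.html
- Add a PVC section (`volumeClaimTemplates`) to the Kubernetes manifest
- Re-running `kubectl apply -f` fails with an error: the change is not allowed
- Delete the existing cluster
  - `kubectl delete elasticsearch quickstart`
- Re-run `kubectl apply -f`: this time it succeeds
- Check the PVC
  - `kubectl get pvc`
  - The volume is now 10Gi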
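For reference, the PVC section mentioned above looks roughly like the structure below. This is a minimal sketch that prints an ECK manifest as JSON (which `kubectl apply -f` accepts as well as YAML). Only the cluster name `quickstart` and the 10Gi size come from these notes; the Elasticsearch version, the node set name and the node count are assumptions.

```python
import json


def eck_manifest(name: str, storage: str, version: str = "7.16.2") -> dict:
    """Build an ECK Elasticsearch resource with an explicit data volume claim."""
    return {
        "apiVersion": "elasticsearch.k8s.elastic.co/v1",
        "kind": "Elasticsearch",
        "metadata": {"name": name},
        "spec": {
            "version": version,  # assumed version, adjust to the cluster
            "nodeSets": [{
                "name": "default",  # assumed node set name
                "count": 1,         # assumed single node
                "volumeClaimTemplates": [{
                    # ECK expects the data volume claim to carry this name
                    "metadata": {"name": "elasticsearch-data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }],
            }],
        },
    }


manifest = eck_manifest("quickstart", "10Gi")
print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file and passing it to `kubectl apply -f` would be one way to use it.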

# Basic Index Usage

- Reference: https://learnku.com/docs/elasticsearch73/7.3/index-some-documents/6450

## Creating an Index

- Create it by PUTting a document directly
  - If the index does not exist yet, it is created automatically
- Command

```
PUT /customer/_doc/1
{
  "name": "John Doe"
}
```

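- What the command means
  - `PUT` adds (indexes) a document
  - `/customer` is the index name (similar to a table name); the rest of these notes use `test_doc` instead
  - `/_doc` is the built-in endpoint for operating on the documents of an index
  - `/1` means the document being operated on has id 1
  - The body is a JSON object
  - It can have multiple fields
    - Can fields be nested over several levels?
- Result
  - An index named `test_doc` is created
  - A document with id 1 is added to that index
  - The document has an `_id` (metadata) and one source field, `name`
  - The system automatically creates a mapping for the index
    - See below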

- Retrieving data
  - `GET /test_doc/_doc/1`
  - Using the `_search` endpoint
    ```
    GET /test_doc/_search
    {
      "query": {
        "match_all": {}
      }
    }
    ```
  - For the syntax of the built-in `_search` endpoint, see the mapping section below
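On the open question above about multi-level fields: Elasticsearch accepts nested JSON objects, and by default the inner fields are addressed by dotted paths such as `address.city`. The sketch below imitates that flattening in plain Python. The sample document and the helper `flatten` are made up for illustration, and no cluster is contacted.

```python
def flatten(doc: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dotted field paths, the way object fields are addressed."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Descend into the inner object and extend the path with a dot.
            flat.update(flatten(value, prefix=path + "."))
        else:
            flat[path] = value
    return flat


# Hypothetical body for: PUT /test_doc/_doc/1
doc = {"name": "John Doe", "address": {"city": "Springfield", "geo": {"zip": "12345"}}}
fields = flatten(doc)
print(fields)
```

A query would then refer to the inner field by its full path, for example a `match` on `address.city`.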

## Working with Data

- Updating a document
    - Simply PUT the document again
    - Check
        - Use GET
        - The content is updated
        - The built-in `_version` field is now 2
- Deleting a document
    - `DELETE /test_doc/_doc/1`
    - Check
        - GET
            - `found: false`
        - Via `_search`
            - `hits.total.value = 0`
- Bulk insert
  - The `_bulk` endpoint
    ```
    curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
    ```
  - Not tried here; it would probably need `-u` to pass a username and password
- Checking the disk space and other stats of each index
  - `GET /_cat/indices?v`
  - The list includes some built-in data
  - Including some Kibana data
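The `_bulk` endpoint used above expects newline-delimited JSON: one action line followed by one source line per document, and a newline at the very end. The sketch below produces such a body. The account records and the `account_number` id field are invented sample data, not the contents of the real `accounts.json`.

```python
import json


def bulk_body(docs: list, id_field: str = "account_number") -> str:
    """Serialize documents as an NDJSON body for the _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line: index the document under the id taken from id_field.
        lines.append(json.dumps({"index": {"_id": str(doc[id_field])}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(doc))
    # The body has to end with a newline.
    return "\n".join(lines) + "\n"


# Invented sample records
accounts = [
    {"account_number": 1, "balance": 39225, "firstname": "Amber"},
    {"account_number": 6, "balance": 5686, "firstname": "Hattie"},
]
body = bulk_body(accounts)
print(body, end="")
```

Saved to a file, the output could be sent with the same `--data-binary "@..."` form as the curl command above.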

## Examining the Index

- Basic search
  - Reference: https://learnku.com/docs/elasticsearch73/7.3/start-searching/6451
  - `"from"` and `"size"` provide pagination
- Aggregating data
  - Reference: https://learnku.com/docs/elasticsearch73/7.3/analyze-results-with-aggregations/6452
  - Similar to SQL `GROUP BY`
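To make the two points above concrete, here is a sketch of a `_search` body that combines `from`/`size` paging with a `terms` aggregation. The field `state.keyword` and the aggregation name `group_by_state` are assumptions borrowed from the style of the referenced tutorial; the helper `search_page` is hypothetical.

```python
import json


def search_page(page: int, size: int, group_field: str,
                agg_name: str = "group_by_state") -> dict:
    """Build a _search body for one result page plus a terms aggregation."""
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,               # number of hits per page
        "aggs": {
            # Bucket documents by field value, similar to SQL GROUP BY
            agg_name: {"terms": {"field": group_field}},
        },
    }


body = search_page(page=3, size=10, group_field="state.keyword")
print(json.dumps(body, indent=2))
```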


## Modifying the Mapping

- Reference
  - https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html#mapping-dynamic
- Modifying the mapping
  - update mapping API
  - https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
- Use search conditions that fit the mapped type
  - The `doc` field is of type `text`, so `match` can be used
    ```
    GET /test_doc/_search
    {
      "query": {
        "match": {
          "doc": "test"
        }
      }
    }
    ```
    - `te`: no match, because the text is tokenized into whole words (English-style) and there is no `te` token
    - `test 1`: matches
  - Using `match_phrase`
    - `test`: matches
    - `test 1`: no match, the two terms do not appear consecutively
  - `doc.keyword` is of type `keyword`
    - `test`: no match, a keyword field has to match the whole value exactly
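The observations above can be reproduced with a toy model of the three query styles. This is only a rough sketch: `tokenize` is a crude stand-in for the standard analyzer, the three functions are simplifications of how `match`, `match_phrase` and an exact keyword comparison behave, and the field content is a hypothetical value chosen to be consistent with the results in the notes.

```python
import re


def tokenize(text: str) -> list:
    """Crude stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(field_value: str, query: str) -> bool:
    """match: at least one query token is among the field tokens (default OR)."""
    tokens = set(tokenize(field_value))
    return any(t in tokens for t in tokenize(query))


def match_phrase(field_value: str, query: str) -> bool:
    """match_phrase: the query tokens appear consecutively and in order."""
    tokens, wanted = tokenize(field_value), tokenize(query)
    return any(tokens[i:i + len(wanted)] == wanted
               for i in range(len(tokens) - len(wanted) + 1))


def keyword_equals(field_value: str, query: str) -> bool:
    """keyword field: the whole stored value must equal the query."""
    return field_value == query


value = "test doc number 1"  # hypothetical content of the `doc` field
results = {
    "match 'te'": match(value, "te"),
    "match 'test 1'": match(value, "test 1"),
    "match_phrase 'test'": match_phrase(value, "test"),
    "match_phrase 'test 1'": match_phrase(value, "test 1"),
    "keyword 'test'": keyword_equals(value, "test"),
}
for name, hit in results.items():
    print(f"{name}: {hit}")
```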

## Dynamic Templates for Mappings

## Explicitly Specifying Mapping Field Types

## Runtime Fields

- Set the mapping's `dynamic` parameter to `runtime`
  - New fields are then added as runtime fields, so the index does not grow just because new fields appear
  - The default is `true`, in which case every new field is indexed
- Once an index has been created, fields under `properties` can no longer be deleted
  - The only option is a reindex
- Removing a runtime field from the mapping
  ```
  PUT my-index-000001/_mapping
  {
    "runtime": {
      "day_of_week": null
    }
  }
  ```
  - The value is still present in `_source`
  - But the field can no longer be used as a `match` condition
- Adding it back with PUT mapping
  ```
  PUT /test_doc/_mapping
  {
    "runtime": {
      "author": {
        "type":"keyword"
      }
    }
  }
  ```
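When these mapping updates are sent from a client rather than typed into the console, the bodies are plain dictionaries. A small sketch follows; the helper names are hypothetical and nothing is sent to a cluster. The one detail worth noting is that removing a runtime field means setting its definition to JSON `null`, which is `None` in Python.

```python
import json


def set_dynamic_runtime() -> dict:
    """Mapping update that makes newly seen fields runtime fields."""
    return {"dynamic": "runtime"}


def remove_runtime_field(name: str) -> dict:
    """Mapping update that drops a runtime field by nulling its definition."""
    return {"runtime": {name: None}}  # None is serialized as JSON null


def add_runtime_field(name: str, field_type: str) -> dict:
    """Mapping update that (re)defines a runtime field of the given type."""
    return {"runtime": {name: {"type": field_type}}}


calls = [
    ("PUT /test_doc/_mapping", set_dynamic_runtime()),
    ("PUT /test_doc/_mapping", remove_runtime_field("author")),
    ("PUT /test_doc/_mapping", add_runtime_field("author", "keyword")),
]
for request_line, body in calls:
    print(request_line)
    print(json.dumps(body))
```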


## Using runtime_mappings in _search

## Indexing a Runtime Field

- Reference
  - https://www.elastic.co/guide/en/elasticsearch/reference/current/runtime-indexed.html
- Runtime fields are not actually indexed
- Create a new index
  - Copy the definition of the original runtime field into the `properties` of the new index
- Load the data again with `_bulk`
  - Are there other ways to load the data?
- Delete the old index
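The steps above can be sketched as a sequence of request bodies. One possible answer to the open question is the `_reindex` API, which copies documents from one index to another on the server side. In the sketch below the target index name `test_doc_v2`, the sample mapping and the helper `promote_runtime_fields` are assumptions made for illustration.

```python
import copy
import json


def promote_runtime_fields(mapping: dict) -> dict:
    """Return a copy of the mapping with every runtime field moved under properties."""
    new_mapping = copy.deepcopy(mapping)
    runtime = new_mapping.pop("runtime", {})
    new_mapping.setdefault("properties", {}).update(runtime)
    return new_mapping


# Hypothetical mapping of the old index
old_mapping = {
    "runtime": {"author": {"type": "keyword"}},
    "properties": {"doc": {"type": "text"}},
}
new_mapping = promote_runtime_fields(old_mapping)

steps = [
    # 1. Create the new index with the field defined under properties.
    ("PUT /test_doc_v2", {"mappings": new_mapping}),
    # 2. Copy the documents over so the field gets indexed.
    ("POST /_reindex", {"source": {"index": "test_doc"},
                        "dest": {"index": "test_doc_v2"}}),
    # 3. Drop the old index once the copy has been verified.
    ("DELETE /test_doc", None),
]
for request_line, body in steps:
    print(request_line)
    if body is not None:
        print(json.dumps(body, indent=2))
```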

## grok pattern

- Can recognize `'%{COMMONAPACHELOG}'`
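To give a feel for what such a pattern extracts, here is a simplified regular-expression approximation in Python. It is not the actual grok definition, which accepts more variants; the group names are assumed to follow the legacy grok field names, and the log line is the usual Apache common-log sample.

```python
import re

# Simplified approximation of COMMONAPACHELOG, for illustration only.
COMMON_LOG = re.compile(
    r'^(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>[A-Z]+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)$'
)

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
m = COMMON_LOG.match(line)
fields = m.groupdict() if m else {}
print(fields)
```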

# Analyzers

## Field Types

- Focus on the text family
  - text
  - annotated-text
  - completion
  - search_as_you_type
  - token_count
- Document ranking types
  - dense_vector
  - sparse_vector
  - rank_feature
  - rank_features
- Special types: geo, for indexing geographic locations
- Any field can hold an array of values
  - A field value can be `"aaa"` or `["aaa","bbb"]`
  - A `match` query for "aaa" matches both
  - This fits scenarios like the following
    - A paper table plus an author table
    - paper:author is 1:N, with a paper_author relation table
    - Once indexed, the author field is simply an array
    - That directly supports queries on, say, both title and author
    - This raises the question of how to convert the source database tables into index documents
    - Later we will look into doing this kind of conversion with Flink
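The table-to-document conversion described above can be sketched in plain Python before bringing in Flink. All rows below are invented sample data, and `build_docs` is a hypothetical helper that denormalizes the relation table into an `authors` array per paper.

```python
from collections import defaultdict

# Hypothetical rows from the three relational tables
papers = [(1, "Search at Scale"), (2, "Stream Processing Basics")]
authors = [(10, "Alice"), (11, "Bob"), (12, "Carol")]
paper_author = [(1, 10), (1, 11), (2, 12)]  # (paper_id, author_id)


def build_docs(papers, authors, paper_author):
    """Join the relation table into one document per paper with an authors array."""
    author_name = dict(authors)
    names_by_paper = defaultdict(list)
    for paper_id, author_id in paper_author:
        names_by_paper[paper_id].append(author_name[author_id])
    return [
        {"id": paper_id, "title": title, "authors": names_by_paper[paper_id]}
        for paper_id, title in papers
    ]


docs = build_docs(papers, authors, paper_author)
for doc in docs:
    print(doc)
```

Each resulting document can then be searched on `title` and `authors` together without any join at query time.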


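## Understanding Analyzers

- An analyzer is made up of three kinds of components
  - 0 to N character filters
  - Exactly 1 tokenizer
  - 0 to N token filters
- An analyzer can be applied at index time or at search time
  - Some analyzers can only be used at search time
  - Generally the same analyzer should be used at index time and at search time
  - But specifying a separate search analyzer can also make sense
    - In that case the search analyzer usually produces stricter conditions
    - For example, at index time `apple` is split into `a`, `ap`, `app`, ...
    - But at search time `appli` stays just `appli`
- Stemming (a token filter)
  - Dictionary-based stemmers give better results but perform worse
  - snowball is commonly used
  - Supplemented by `keyword_marker` and similar filters to customize the stemming process
- Token graph
  - Among token filters, only `synonym_graph` and `word_delimiter_graph` support it
  - The variants without the `_graph` suffix do not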
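The three-stage structure and the `apple` / `appli` example above can be imitated with a few plain functions. This is a toy model under stated assumptions, not Elasticsearch code: every function below is hypothetical, `edge_ngrams` only mimics what an edge n-gram token filter does, and matching is reduced to a simple token lookup.

```python
import re


def analyze(text, char_filters, tokenizer, token_filters):
    """Run text through char filters, one tokenizer, then token filters, in that order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


def strip_html(text):
    """Character filter: replace tags with spaces."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenizer(text):
    """Tokenizer: split on whitespace."""
    return text.split()


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def edge_ngrams(tokens, min_gram=1, max_gram=10):
    """Token filter: emit every prefix of each token."""
    return [t[:n] for t in tokens
            for n in range(min_gram, min(len(t), max_gram) + 1)]


def index_analyzer(text):
    return analyze(text, [strip_html], whitespace_tokenizer, [lowercase, edge_ngrams])


def search_analyzer(text):
    # Stricter at search time: no n-grams, the query term is kept whole.
    return analyze(text, [strip_html], whitespace_tokenizer, [lowercase])


indexed = index_analyzer("<b>Apple</b>")


def matches(query):
    """A query matches if any of its search-time tokens was indexed."""
    return any(t in indexed for t in search_analyzer(query))


print(indexed)
print("app:", matches("app"))
print("appli:", matches("appli"))
```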

### Testing an Analyzer

- Using `_analyze` on its own
  ```
  POST _analyze
  {
    "analyzer": "whitespace",
    "text":     "The quick brown fox."
  }
  ```
- Specifying an analyzer for a field of an index
  - Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/test-analyzer.html
  - Set the field's `type` to `text` and, at the same level, set `analyzer` to the name of the custom analyzer
  - Define the custom analyzer under `settings.analysis.analyzer`
  - To test it, call `<index name>/_analyze`
  - Pass `field` and `text` (this works even without any real data in the index)
- TODO
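The per-field analyzer setup described in the list above can be sketched as two request bodies. The index name `test_doc_analyzed`, the analyzer name `my_custom_analyzer` and the chosen tokenizer and filters are assumptions for illustration; nothing is sent to a cluster.

```python
import json

INDEX_NAME = "test_doc_analyzed"      # hypothetical index name
ANALYZER_NAME = "my_custom_analyzer"  # hypothetical analyzer name
FIELD_NAME = "doc"

create_index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer: one tokenizer plus a chain of token filters
                ANALYZER_NAME: {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "snowball"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # The analyzer name sits at the same level as the field type
            FIELD_NAME: {"type": "text", "analyzer": ANALYZER_NAME}
        }
    },
}

# Test request: analyze a text the way the given field would analyze it
analyze_body = {"field": FIELD_NAME, "text": "The quick brown foxes"}

print(f"PUT /{INDEX_NAME}")
print(json.dumps(create_index_body, indent=2))
print(f"GET /{INDEX_NAME}/_analyze")
print(json.dumps(analyze_body, indent=2))
```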

## Custom Synonym Replacement

## Applying an Analyzer to an Index
