Elasticsearch is a distributed, RESTful search and analytics server. Like Apache Solr, it is an index server built on Lucene, but in my view Elasticsearch has several advantages over Solr.
Environment Setup
Start Elasticsearch; it listens on port 9200, and you can view the returned JSON in a browser. Elasticsearch accepts and returns all data in JSON format.
>> bin/elasticsearch -f
Install the official Python API. On OS X, some Python runtime errors appeared after installation; they were caused by an outdated setuptools, and removing and reinstalling it fixed the problem.
>> pip install elasticsearch
Index Operations
To index a single document, call the create or index method.
from datetime import datetime
from elasticsearch import Elasticsearch
es = Elasticsearch()  # connect to a localhost server, or Elasticsearch("ip")
es.create(index="test-index", doc_type="test-type", id=1,
          body={"any": "data", "timestamp": datetime.now()})
Elasticsearch's bulk-indexing command is bulk. The Python API documentation currently has few examples for it, and it took quite some time reading the source code to work out the submission format for bulk indexing.
from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch import helpers
es = Elasticsearch("10.18.13.3")
j = 0
count = int(df[0].count())
actions = []
while j < count:
    action = {
        "_index": "tickets-index",
        "_type": "tickets",
        "_id": j + 1,
        "_source": {
            "crawaldate": df[0][j],
            "flight": df[1][j],
            "price": float(df[2][j]),
            "discount": float(df[3][j]),
            "date": df[4][j],
            "takeoff": df[5][j],
            "land": df[6][j],
            "source": df[7][j],
            "timestamp": datetime.now()}
    }
    actions.append(action)
    j += 1
    if len(actions) == 500000:
        helpers.bulk(es, actions)
        del actions[0:len(actions)]
if len(actions) > 0:
    helpers.bulk(es, actions)
    del actions[0:len(actions)]
Here I found that the Python API's JSON serialization supports a rather limited set of data types: the NumPy.Int32 values in the raw data had to be converted to plain int before they could be indexed. Also, the bulk operation submits 500 documents per request by default; when I raised this to 5000 or even 50000 in testing, some documents failed to be indexed.
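A minimal sketch of that type issue, assuming NumPy values like those produced by the DataFrame above: the standard json module (which the client uses for serialization) rejects NumPy scalars, and casting to a plain Python int is enough to fix it.

```python
import json
import numpy as np

# A NumPy scalar such as np.int32 is not accepted by json.dumps
price = np.int32(1299)
try:
    json.dumps({"price": price})
    serializable = True
except TypeError:
    serializable = False

# Casting to a plain Python int before building the action fixes serialization
doc = {"price": int(price)}
encoded = json.dumps(doc)
```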
# helpers.py source code (excerpt)
def streaming_bulk(client, actions, chunk_size=500, raise_on_error=False,
                   expand_action_callback=expand_action, **kwargs):
    actions = map(expand_action_callback, actions)
    # if raise_on_error is set, we need to collect errors per chunk before raising them
    errors = []
    while True:
        chunk = islice(actions, chunk_size)
        bulk_actions = []
        for action, data in chunk:
            bulk_actions.append(action)
            if data is not None:
                bulk_actions.append(data)
        if not bulk_actions:
            return

def bulk(client, actions, stats_only=False, **kwargs):
    success, failed = 0, 0
    # list of errors to be collected, if not stats_only
    errors = []
    for ok, item in streaming_bulk(client, actions, **kwargs):
        # go through request-response pairs and detect failures
        if not ok:
            if not stats_only:
                errors.append(item)
            failed += 1
        else:
            success += 1
    return success, failed if stats_only else errors
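The chunking in streaming_bulk relies on itertools.islice over a shared iterator: each pass consumes at most chunk_size actions, and the loop stops when a chunk comes back empty. The same pattern can be sketched standalone:

```python
from itertools import islice

def chunked(actions, chunk_size=500):
    """Yield lists of at most chunk_size items, mimicking streaming_bulk's slicing."""
    it = iter(actions)  # shared iterator: each islice call resumes where the last stopped
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

# 1200 actions split into chunks of 500, 500, and 200
chunks = list(chunked(range(1200), chunk_size=500))
```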
For bulk delete and update operations on an index, the corresponding document formats are shown below; the doc node in the update document is required.
{
    '_op_type': 'delete',
    '_index': 'index-name',
    '_type': 'document',
    '_id': 42,
}
{
    '_op_type': 'update',
    '_index': 'index-name',
    '_type': 'document',
    '_id': 42,
    'doc': {'question': 'The life, universe and everything.'}
}
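These action dictionaries can be built programmatically and handed to helpers.bulk just like index actions. A sketch (the index name, type, and ids are illustrative, and the final helpers.bulk call is commented out since it needs a live server):

```python
def delete_action(index, doc_type, doc_id):
    # matches the bulk delete format above
    return {'_op_type': 'delete', '_index': index, '_type': doc_type, '_id': doc_id}

def update_action(index, doc_type, doc_id, fields):
    # the partial document must sit under the required 'doc' key
    return {'_op_type': 'update', '_index': index, '_type': doc_type,
            '_id': doc_id, 'doc': fields}

actions = [delete_action('index-name', 'document', 42),
           update_action('index-name', 'document', 42,
                         {'question': 'The life, universe and everything.'})]
# helpers.bulk(es, actions)  # submit against a running cluster
```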
Common Errors
Performance

Above is a comparison of storing the same data in MongoDB and in Elasticsearch. Although the servers and access methods are not exactly the same, it shows that for bulk writes a database still has the advantage over an index server.
Elasticsearch's index files are chunked automatically, and write speed was unaffected even at tens of millions of documents. However, when disk space reached its limit, Elasticsearch hit file-merge errors and lost a large amount of data (over one million records in total). After the client stopped writing, the server could not recover on its own and had to be stopped manually. This is fairly fatal in a production environment; in particular, with non-Java clients there seems to be no way to obtain the server-side Java exception on the client, so programmers must handle the server's response messages very carefully.