
ElasticSearch

Install

ElasticSearch

docker pull elasticsearch:7.4.2

Kibana for visualization

docker pull kibana:7.4.2

Run

Prepare Elasticsearch

 mkdir -p /mydata/elasticsearch/config # store the ES configuration
 
 mkdir -p /mydata/elasticsearch/data # store the ES data
 
 chmod -R 777 /mydata/elasticsearch # relax permissions so the container can write

# write the bind host into the config file so ES accepts remote connections
 echo "http.host: 0.0.0.0" >> /mydata/elasticsearch/config/elasticsearch.yml

Run Elasticsearch

# 9200 is the HTTP port clients use to send requests to ES; 9300 is the transport port between cluster nodes.
# discovery.type=single-node starts a single node; ES_JAVA_OPTS limits the heap for development/testing.
# The -v options mount the config file, the data directory, and the plugins directory.
docker run --name es -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2

Run Kibana

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://123.56.16.54:9200 -p 5601:5601 -d kibana:7.4.2

Preliminary search

  • GET request: _cat/nodes  list nodes
  • GET request: _cat/health  cluster health
  • GET request: _cat/master  view the master node
  • GET request: _cat/indices  view all indices
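In Kibana's Dev Tools console these read, for example:

```
GET _cat/nodes
GET _cat/health
GET _cat/master
GET _cat/indices
```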

Create a document

  • PUT request: index/type/id — the id is required; if the id does not exist the document is created, if it exists the document is updated

  • POST request: index/type/id — the id is optional; without an id a random one is generated and a new document is created, with an existing id the document is updated
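For example, a sketch using a hypothetical customer index with type external:

```
PUT customer/external/1
{
  "name": "John Doe"
}
```

Repeating the same PUT updates the document and increments its version.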

Retrieve a document

  • GET request: index/type/id — look up by id

    Optimistic locking: append ?if_seq_no=1&if_primary_term=1 to a write request; it only succeeds if the document's current _seq_no and _primary_term match
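A sketch of an optimistic-lock write (customer is a hypothetical index; the values 1 come from the _seq_no and _primary_term returned by a previous read). The write fails with a version conflict if another write changed the document in the meantime:

```
PUT customer/external/1?if_seq_no=1&if_primary_term=1
{
  "name": "John Doe 2"
}
```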

Update document

  • POST request: index/type/id/_update

    Method 1: with _update, if the new data is identical to the existing document the version number is not incremented and the result is noop (no operation)

    Method 2: without _update, sending the same data still increments the version number; note that the JSON body is written differently (with _update the fields must be wrapped in a "doc" object)

  • PUT request: index/type/id
    • The _update form can only be a POST request; PUT always replaces the whole document
    • It does not check whether the data is the same as the last update
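The two update bodies can be sketched as follows (hypothetical customer index; note the doc wrapper required by _update):

```
POST customer/external/1/_update
{
  "doc": {
    "name": "Jane Doe"
  }
}

POST customer/external/1
{
  "name": "Jane Doe"
}
```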

Delete document / index

  • DELETE request: index/type/id
  • DELETE request: index

Batch operation

In the request body, each operation takes two lines: an action line and a data line. Delete is the exception: it has no request body, so it takes only one line
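A minimal _bulk sketch (hypothetical customer index and ids): the index action takes an action line plus a data line, while delete takes only the action line:

```
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"delete":{"_id":"2"}}
```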

Advanced Search

Search API

#Method 1: URI + query parameters
GET index01/_search?q=*&sort=_id:asc
#Method 2: URI + request body
GET index01/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
   { 
       "_id": {
       	   "order":  "asc" #By id ascending desc descending
       }
   },
   ...
  ]
}
#Only 10 hits are returned by default

Query DSL

match
# 1. match_all query all
GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
   { 
       "balance": {
       	   "order":  "asc" # sort by balance ascending; desc for descending
       }
   }
  ],
  "from": 5,  # offset: where to start
  "size": 5,  # how many documents to return per page
  "_source": ["balance","firstname"] # which fields to return
}
# 2. match finds documents whose given field has the given value
GET /bank/_search
{
  "query": {
    "match": {
    	"account_number": 20
    }
  }
}
# match can also do full-text matching: the query is split into words on spaces, and results are ordered by relevance (the number of matching words); the best match has the highest _score
GET /bank/_search
{
  "query": {
    "match": {
    	"address": "mill lane"
    }
  }
}

# 3. match_phrase: exact phrase match
GET /bank/_search
{
  "query": {
    "match_phrase": {
    	"address": "Lane mill"
    }
  }
}
# 4. multi_match: match against multiple fields; the more of the query terms that appear in the specified fields, the higher the score
GET /bank/_search
{
  "query": {
    "multi_match": {
    	"query": "Lane Brogan",
    	"fields": ["address","city"]
    }
  }
}
bool
  • must
  • must_not
  • should: not required, but documents that match score higher
# The conditions below require "gender" to be "M" and "address" to contain "mill", exclude age 28, and rank documents whose lastname is "Holland" higher
GET /bank/_search
{
  "query":{
    "bool": {
      "must": [
        {"match": {
          "gender": "M"
        }},
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {"match": {
          "age": "28"
        }}
      ],
      "should": [
        {"match": {
          "lastname": "Holland"
        }}
      ]
    }
  }
}
filter
GET /bank/_search
{
  "query":{
    "bool": {
      "filter": {  #No effect on score
        "range": {
          "age": {
            "gte": 10,
            "lte": 40
          }
        }
      }
    }
  }
}

term

Use term for exact matches on non-text fields; match is suited to full-text fields

GET /bank/_search
{
  "query": {
    "term": {
    	"balance": "32838"
    }
  }
}
field.keyword vs match

The .keyword sub-field must match exactly, including spaces and case

GET /bank/_search
{
  "query": {
    "match": {
    	"address.keyword": "789 Madison Street"
    }
  }
}
# In the example below, match_phrase matches as long as the complete phrase (whole word A, whole word B) exists; it is case-insensitive, but the word order cannot change
GET /bank/_search
{
  "query": {
    "match_phrase": {
    	"address": "789 madison"
    }
  }
}
# In the example below, match matches if any one of the words exists, regardless of word order or case; the more words match, the higher the score
GET /bank/_search
{
  "query": {
    "match": {
    	"address": "789 madison"
    }
  }
}

# Note: all of the above match whole terms; a query word only matches an indexed term of the same length (no substring matching)

Aggregations

GET bank/_search
{
  "query": {
    "match_all": {}
  },
 "aggs": {
   "myagg1": {
      "terms": {
        "field": "age"	# view the age distribution
      }
   },
   "myaggAVG":{
      "avg": {
        "field": "balance" #View average balance
      }
   }
 },
 "size": 0
}

GET bank/_search
{
  "query": {
    "match_all": {}
  },
 "aggs": {
   "myagg1": {
      "terms": {
        "field": "age"
      },
      "aggs": {
        "myaggAVG": {	# sub-aggregation: for each age bucket, compute the average balance
          "avg": {
            "field": "balance"
          }
        }
      }
   }
 },
 "size": 0 # do not return the hits themselves, only the aggregation results
}

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "myagg": {
      "terms": {
        "field": "age",
        "size": 10
      },
      "aggs": {
        "myaggGender": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          "aggs": {
            "myaggGenderAVG": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "myaggAllAVG":{
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }, 
 "size": 0
}
# For each age group, the average balance of men and women, plus the overall average balance of the group

Mapping

Create a mapping relationship

PUT /my_index
{
  "mappings": {
    "properties": {
      "age": {"type": "integer"},
      "email":{"type": "keyword"},
      "name": {"type": "text","index":true} # "index" defaults to true: the field is indexed and searchable
    }
  }
}

View mapping information

GET request: index/_mapping
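For the index created above:

```
GET /my_index/_mapping
```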

Modify mapping information

Add a new field

PUT /my_index/_mapping
{
  "properties":{
    "address":{
      "type":"keyword",
      "index":false
    }
  }
}

Note: existing field mappings cannot be modified

To change a mapping, create a new index with the correct mapping and then migrate the data

PUT /new_bank  #Create a new index with the corrected mapping
{
  "mappings": {
    "properties": {
      "account_number": {
        "type": "long"
      },
      "address": {
        "type": "text"
      },
      "age": {
        "type": "integer"
      },
      "balance": {
        "type": "long"
      },
      "city": {
        "type": "keyword"
      },
      "email": {
        "type": "keyword"
      },
      "employer": {
        "type": "keyword"
      },
      "firstname": {
        "type": "text"
      },
      "gender": {
        "type": "keyword"
      },
      "lastname": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "state": {
        "type": "keyword"
      }
    }
  }
}


POST _reindex	#Migrate the data; the new index no longer uses a type
{
  "source": {
    "index": "bank",
    "type": "account" # the source data still has a type; types are deprecated after 6.0
  },
  "dest": {
    "index": "new_bank" # no type
  }
}

The original type was meant to classify documents within an index

Since types were removed, every document's type is _doc
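So a document is now addressed with _doc in place of a custom type; for example (assuming a document with id 1 exists in new_bank after the migration):

```
GET new_bank/_doc/1
```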

Tokenization

Install the plug-in

Download elasticsearch-analysis-ik-7.4.2

Unzip it into the mounted plugins directory, or copy it into the container

Check if the plug-in is installed successfully

docker exec -it es /bin/bash

cd bin

elasticsearch-plugin list # list all installed plug-ins

Test the two IK analyzers:

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": ["I am Chinese,"]
}

POST _analyze
{
  "analyzer": "ik_smart",
  "text": ["I am Chinese,"]
}

Custom Dictionary

Install nginx
# Start any nginx instance
docker run -p 80:80 --name nginx -d nginx:1.10

# Copy the configuration files out of the container (note the space and the trailing dot)
docker container cp nginx:/etc/nginx .

# Stop and remove this temporary container
docker stop nginx
docker rm nginx

# Rename the copied directory to conf and move it into a new nginx directory
mv nginx conf
mkdir nginx
mv conf nginx/

# Run nginx with the prepared directories mounted
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx:1.10

#Test whether nginx started successfully
cd html
vi index.html
#write in:
<h1>hahahha~~~~~</h1>

#Save, exit, then visit port 80

Create a custom thesaurus
  1. In nginx's html directory, create a new es folder, create a .txt file in it, enter the words, save, and check that it is reachable at ip:80/es/xxx.txt

  2. Edit the IKAnalyzer.cfg.xml file in the plugins/elasticsearch-analysis-ik-7.4.2/config directory

    Write the access path into the remote extension dictionary entry

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
    <properties>
            <comment>IK Analyzer extended configuration</comment>
            <!-- users can configure their own extended dictionary here -->
            <entry key="ext_dict"></entry>
            <!-- users can configure their own extended stop-word dictionary here -->
            <entry key="ext_stopwords"></entry>
            <!-- users can configure a remote extended dictionary here -->
            <entry key="remote_ext_dict">http://your-ip/es/participle.txt</entry>
            <!-- users can configure a remote extension stop-word dictionary here -->
            <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
    </properties>
    
  3. Restart Elasticsearch (docker restart es)

Integrating SpringBoot

Rest Client documentation

  • Create the Spring Boot project and select only web; do not use the Elasticsearch integration shipped with Spring Boot, because its client version has not been updated to 7.x
  • Import dependency
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>
  • Align the version: Spring Boot 2.2.7 has 6.8.8 built in, so pin the version explicitly
<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
  • Configuration
@Configuration
public class ElasticSearchConfig {

    public static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        /*builder.addHeader("Authorization", "Bearer " + TOKEN);
        builder.setHttpAsyncResponseConsumerFactory(
                new HttpAsyncResponseConsumerFactory
                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));*/
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient(){
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("ip...", 9200, "http")));
        return client;
    }
}

Create index

@Test
void index() throws IOException{
    //Create an index request targeting the "users" index
    IndexRequest indexRequest = new IndexRequest("users");
    //Specify id
    indexRequest.id("1");

    User user = new User();
    user.setUserName("Xiaoming");
    user.setGender("M");
    user.setAge(18);
    // Convert to a JSON string
    String u = JSON.toJSONString(user);
    // Save json
    indexRequest.source(u, XContentType.JSON);

    // Perform action
    IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS);

    System.out.println(index);
}

Output results

IndexResponse[index=users,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]

Get

@Test
	public void get() throws IOException{
		GetRequest getRequest = new GetRequest(
				"users",
				"1");
		GetResponse fields = client.get(getRequest, ElasticSearchConfig.COMMON_OPTIONS);
		System.out.println(fields);
	}

Output

{"_index":"users","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{"age":18,"gender":"M","userName":"Xiaoming"}}

Delete

@Test
	public void delete() throws IOException{
		DeleteRequest request = new DeleteRequest(
				"users",
				"1");
		DeleteResponse delete = client.delete(request, ElasticSearchConfig.COMMON_OPTIONS);
		System.out.println(delete);
	}

Search


	@Test
	public void search() throws IOException{
		//Create request
		SearchRequest searchRequest = new SearchRequest("new_bank");
		//Build the query conditions
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		//Match documents whose address contains "mill"
		searchSourceBuilder.query(QueryBuilders.matchQuery("address","mill"));
		//Aggregate by age distribution,
		TermsAggregationBuilder ageTerm = AggregationBuilders.terms("ageTerm").field("age");
		//Average balance
		AvgAggregationBuilder balanceAvg = AggregationBuilders.avg("balanceAvg").field("balance");

		//Add aggregation
		searchSourceBuilder.aggregation(ageTerm);
		searchSourceBuilder.aggregation(balanceAvg);

		System.out.println("request"+searchSourceBuilder);

		//Attach the conditions to the request
		searchRequest.source(searchSourceBuilder);

		//Execute the search
		SearchResponse response = client.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);

		System.out.println("Retrieve return information"+response);

		//Parse the returned hits
		SearchHits hits = response.getHits();
		System.out.println("The search results are as follows:"+hits.getTotalHits());

		SearchHit[] searchHits = hits.getHits();
		for(SearchHit hit :searchHits){
			String string = hit.getSourceAsString();
			Account account = JSON.parseObject(string,Account.class);
			System.out.println("Search result: " + account);
		}

		//Parse the aggregation results
		Aggregations aggregations = response.getAggregations();

		Terms term = aggregations.get("ageTerm");
		for (Terms.Bucket bucket : term.getBuckets()) {
			String keyAsString = bucket.getKeyAsString();
			System.out.println("Age " + keyAsString + ": " + bucket.getDocCount() + " docs");
		}

		Avg avg = aggregations.get("balanceAvg");
		System.out.println("The average balance obtained is:"+avg.getValue());

	}

Output results

request
{
	"query": {
		"match": {
			"address": {
				"query": "mill",
				"operator": "OR",
				"prefix_length": 0,
				"max_expansions": 50,
				"fuzzy_transpositions": true,
				"lenient": false,
				"zero_terms_query": "NONE",
				"auto_generate_synonyms_phrase_query": true,
				"boost": 1.0
			}
		}
	},
	"aggregations": {
		"ageTerm": {
			"terms": {
				"field": "age",
				"size": 10,
				"min_doc_count": 1,
				"shard_min_doc_count": 0,
				"show_term_doc_count_error": false,
				"order": [{
					"_count": "desc"
				}, {
					"_key": "asc"
				}]
			}
		},
		"balanceAvg": {
			"avg": {
				"field": "balance"
			}
		}
	}
}
Retrieve return information
{
	"took": 2,
	"timed_out": false,
	"_shards": {
		"total": 1,
		"successful": 1,
		"skipped": 0,
		"failed": 0
	},
	"hits": {
		"total": {
			"value": 4,
			"relation": "eq"
		},
		"max_score": 5.4032025,
		"hits": [{
			"_index": "new_bank",
			"_type": "_doc",
			"_id": "970",
			"_score": 5.4032025,
			"_source": {
				"account_number": 970,
				"balance": 19648,
				"firstname": "Forbes",
				"lastname": "Wallace",
				"age": 28,
				"gender": "M",
				"address": "990 Mill Road",
				"employer": "Pheast",
				"email": "forbeswallace@pheast.com",
				"city": "Lopezo",
				"state": "AK"
			}
		}, ...]
	},
	"aggregations": {
		"avg#balanceAvg": {
			"value": 25208.0
		},
		"lterms#ageTerm": {
			"doc_count_error_upper_bound": 0,
			"sum_other_doc_count": 0,
			"buckets": [{
				"key": 38,
				"doc_count": 2
			}, {
				"key": 28,
				"doc_count": 1
			}, {
				"key": 32,
				"doc_count": 1
			}]
		}
	}
}
The search results are as follows: 4 hits
Search result: Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
Search result: Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
Search result: Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
Search result: Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
Age 38: 2 docs
Age 28: 1 docs
Age 32: 1 docs
The average balance obtained is: 25208.0

Test with Kibana