To perform a search operation on a specific index:
GET /my_index/_search
Without a request body, it matches all documents and returns the first 10 hits by default. The following request is equivalent:
GET /my_index/_search
{
"query": {
"match_all": {}
}
}
In Kibana's Dev Tools, the query object in the body of a search request defines which documents we want to retrieve from Elasticsearch. It specifies the search criteria ("find me documents that match these conditions") and is the core part of any search request: it determines which documents from our index are returned in the response.
The query object can contain various types of queries. Common query types:
match_all - Returns all documents:
{
"query": {
"match_all": {}
}
}
match - Full-text search on a specific field:
{
"query": {
"match": {
"field_name": "search_term"
}
}
}
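term - Exact term matching (the search value is not analyzed, so this works best on keyword, numeric, or date fields rather than analyzed text):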
{
"query": {
"term": {
"status": "active"
}
}
}
bool - Combine multiple queries with logical operators:
{
"query": {
"bool": {
"must": [
{"match": {"title": "elasticsearch"}},
{"range": {"date": {"gte": "2023-01-01"}}}
]
}
}
}
range - Query for values within a range:
{
"query": {
"range": {
"age": {
"gte": 18,
"lte": 65
}
}
}
}
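The query types above are plain JSON objects, so they can also be assembled programmatically before being sent to Elasticsearch. The following is a minimal sketch in Python using only the standard library; the helper names (match, range_clause, bool_must) are hypothetical and not part of any Elasticsearch client. It rebuilds the bool query shown above and prints it as JSON that can be pasted into Dev Tools.

```python
import json


def match(field, text):
    """Full-text match clause for a single field."""
    return {"match": {field: text}}


def range_clause(field, gte=None, lte=None):
    """Range clause; only the bounds that were given are included."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {"range": {field: bounds}}


def bool_must(*clauses):
    """Request body whose bool query requires every clause to match."""
    return {"query": {"bool": {"must": list(clauses)}}}


# Same request body as the bool example above.
body = bool_must(
    match("title", "elasticsearch"),
    range_clause("date", gte="2023-01-01"),
)
print(json.dumps(body, indent=2))
```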
To get the number of documents in an Elasticsearch index, we can use the _count API, the _cat API, or the _stats API. With _count:
GET /my_index/_count
This will return a response like:
{
"count": 12345,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
}
}
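If the response is processed in a script, it helps to read the _shards section together with the count, because a count from a request with failed shards may be incomplete. A minimal sketch in Python, assuming the response body is available as a JSON string; the function name read_count is hypothetical, and treating "failed": 0 as a complete result is a simplification.

```python
import json

# Sample response copied from the _count example above.
raw_response = """
{
  "count": 12345,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0}
}
"""


def read_count(raw):
    """Return (count, complete); complete is False if any shard failed."""
    data = json.loads(raw)
    failed = data["_shards"]["failed"]
    return data["count"], failed == 0


count, complete = read_count(raw_response)
print(count, complete)
```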
To return a specific number of documents from a search, use the size parameter (by default, from + size cannot exceed 10,000):
GET my_index/_search?size=900
We can also get the document count with the _cat API:
GET /_cat/count/my_index?v
This will return output like:
epoch timestamp count
1718012345 10:32:25 12345
The _stats API also reports the document count:
GET /my_index/_stats
The relevant part of the response:
{
"indices": {
"my_index": {
"primaries": {
"docs": {
"count": 12345,
"deleted": 12
}
}
}
}
}
To get the set of distinct values of a field (for example, channel_type) across all documents in my_index, we can use a terms aggregation:
GET my_index/_search
{
"size": 0,
"aggs": {
"unique_channel_types": {
"terms": {
"field": "channel_type.keyword",
"size": 10000 // increase if you expect many unique values
}
}
}
}
Explanation:
- "size": 0: No documents returned, just aggregation results.
- "terms": Collects unique values.
- "channel_type.keyword": Use .keyword to aggregate on the raw value (not analyzed text).
- "size": 10000: Max number of buckets (unique values) to return. Adjust as needed.
Response example:
{
"aggregations": {
"unique_channel_types": {
"buckets": [
{ "key": "email", "doc_count": 456 },
{ "key": "push", "doc_count": 321 },
{ "key": "sms", "doc_count": 123 }
]
}
}
}
The "key" values in the buckets array are the distinct channel_type values.
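To pull the distinct values out of such a response in a script, we only need to walk the buckets array. The sketch below uses Python's standard library and the sample response from above; the function name distinct_values is hypothetical. It also checks sum_other_doc_count, a field Elasticsearch includes in real terms aggregation responses (it is omitted from the shortened sample, so the code defaults it to 0): a value above 0 means some documents fell into buckets that were not returned, so "size" should be increased.

```python
import json

# Sample response copied from the aggregation example above.
raw_response = """
{
  "aggregations": {
    "unique_channel_types": {
      "buckets": [
        {"key": "email", "doc_count": 456},
        {"key": "push", "doc_count": 321},
        {"key": "sms", "doc_count": 123}
      ]
    }
  }
}
"""


def distinct_values(response, agg_name):
    """Return (values, truncated) for a terms aggregation result."""
    agg = response["aggregations"][agg_name]
    values = [bucket["key"] for bucket in agg["buckets"]]
    # Documents counted here belong to buckets that were cut off by "size".
    truncated = agg.get("sum_other_doc_count", 0) > 0
    return values, truncated


values, truncated = distinct_values(json.loads(raw_response), "unique_channel_types")
print(values, truncated)
```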
Let's assume that my_index has an @timestamp field that is correctly mapped as a date type. Here it is a root-level field; if it sits at another path, we'd need to adjust the field name in the query.
To get the oldest document:
GET my_index/_search
{
"size": 1,
"sort": [
{ "@timestamp": "asc" }
]
}
To get the newest document:
GET my_index/_search
{
"size": 1,
"sort": [
{ "@timestamp": "desc" }
]
}
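The two requests differ only in the sort order, so a script can generate both from one function. A minimal Python sketch; the function name edge_document_body and its parameters are hypothetical, and the default field name @timestamp follows the assumption stated above.

```python
import json


def edge_document_body(timestamp_field="@timestamp", newest=False):
    """Body for a search that returns only the oldest or newest document."""
    order = "desc" if newest else "asc"
    return {"size": 1, "sort": [{timestamp_field: order}]}


oldest_body = edge_document_body()
newest_body = edge_document_body(newest=True)
print(json.dumps(oldest_body))
print(json.dumps(newest_body))
```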
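How do we get all distinct values of a field across the documents added to the index in the last 24 hours?
We can combine a terms aggregation with a range query. Note that "now-24h/h" rounds the lower bound down to the start of the hour, so the window can be slightly longer than 24 hours; use "now-24h" for an exact 24-hour window: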
GET /my_index/_search
{
"size": 0,
"query": {
"range": {
"@timestamp": {
"gte": "now-24h/h",
"lte": "now"
}
}
},
"aggs": {
"unique_values": {
"terms": {
"field": "my_field.keyword",
"size": 10000
}
}
}
}
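The same request body can be parameterized when the field or the time window changes often. This is a minimal sketch in Python; the function name recent_values_body and its parameters are hypothetical, and the result is just the JSON body shown above, not a call to Elasticsearch.

```python
import json


def recent_values_body(field, hours=24, timestamp_field="@timestamp", max_buckets=10000):
    """Body for a terms aggregation limited to the last `hours` hours."""
    return {
        "size": 0,  # no hits, aggregation results only
        "query": {
            "range": {
                # "/h" rounds the lower bound down to the start of the hour
                timestamp_field: {"gte": f"now-{hours}h/h", "lte": "now"}
            }
        },
        "aggs": {
            "unique_values": {
                "terms": {"field": field, "size": max_buckets}
            }
        },
    }


body_24h = recent_values_body("my_field.keyword")
print(json.dumps(body_24h, indent=2))
```

Calling recent_values_body("my_field.keyword") reproduces the request body above; passing hours=6, for example, narrows the window to the last six hours.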