How to map JSON document fields using Index Mapping
In this article we will see how to use index mappings to define custom field mappings, which is also known as explicit mapping. We will create an index with custom field mapping settings and index the documents into this new index. Finally, we will validate the difference in the field types using the Kibana console.
Test Environment
Fedora workstation 37
Docker
Docker Compose
What is Mapping
Every document that we index consists of data, and this data is held in the fields of the document. A document can contain data in the form of text, numbers, dates or geospatial locations. Mapping is the process of defining how a document, and the fields it contains, are stored and indexed: each field is mapped to a corresponding field type.
There are two ways mapping can be carried out in Elasticsearch.
Mapping | Description |
Dynamic Mapping | Documents are indexed with a default mapping definition, and default field types are applied based on the document data. This automatic detection and addition of new fields is called dynamic mapping |
Explicit Mapping | Field types are explicitly defined in a mapping definition that is applied to the documents being indexed. This gives greater control over the field types |
An index can contain both explicitly mapped and dynamically mapped fields.
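To make the dynamic mapping defaults more concrete, here is a minimal Python sketch that approximates which field type Elasticsearch would pick for a JSON value. It is a simplification that ignores date detection and numeric detection, and the function name and the sample values are hypothetical; only the field names come from the “prizes” index used in this article.

```python
from typing import Any, Optional


def infer_dynamic_type(value: Any) -> Optional[str]:
    """Approximate the field type dynamic mapping picks for a JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Strings become "text" with a "keyword" sub-field (date detection ignored here).
        return "text"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Arrays take the type of their first non-null element.
        for item in value:
            if item is not None:
                return infer_dynamic_type(item)
    # null values and empty arrays do not add a field to the mapping.
    return None


# Hypothetical document: the values are strings, so every leaf field ends up as "text".
sample_doc = {"year": "2021", "category": "physics", "laureates": [{"id": "1", "share": "2"}]}
for field, value in sample_doc.items():
    print(field, "->", infer_dynamic_type(value))
```

Because numeric values stored as JSON strings are still strings, dynamic mapping treats them as text, which is why an explicit mapping is useful for such fields.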
If you prefer to watch a video, there is a YouTube video covering the same step-by-step procedure outlined below.
Procedure
Step1: Ensure the JSON data set is indexed with Dynamic Mapping
This article is a continuation of my previous article “How to index json dataset in Elasticsearch”, in which we took a JSON data set and indexed it using Elasticsearch. Elasticsearch took care of mapping the fields based on the JSON data using dynamic mapping, which is enabled by default in an Elasticsearch cluster setup.
Step2: Get the Index Mapping details
Let’s use the following API request to get the current “prizes” index mapping details. You can run this API request using the Dev Tools provided in the Kibana console. As you can see from the response, every field is mapped as type “text” with a “keyword” sub-field.
Request
GET prizes/_mapping
Response
{
  "prizes": {
    "mappings": {
      "properties": {
        "category": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "laureates": {
          "properties": {
            "firstname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "id": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "motivation": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "share": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "surname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "overallMotivation": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "year": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
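A nested mapping response like this is easier to compare once it is flattened into field path and type pairs. The following is a small sketch of how that could be done in Python; the function name is hypothetical and the response dictionary is a trimmed copy of the response above, with the “keyword” multi-fields and some fields left out for brevity.

```python
from typing import Dict


def flatten_mapping(properties: dict, prefix: str = "") -> Dict[str, str]:
    """Return {dotted.field.path: type} for a mapping "properties" block."""
    flat = {}
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        if "properties" in definition:
            # Object field: recurse into its nested properties.
            flat.update(flatten_mapping(definition["properties"], prefix=f"{path}."))
        else:
            # Leaf field: record its declared type.
            flat[path] = definition.get("type", "object")
    return flat


# Trimmed copy of the GET prizes/_mapping response.
response = {
    "prizes": {"mappings": {"properties": {
        "category": {"type": "text"},
        "laureates": {"properties": {
            "id": {"type": "text"},
            "share": {"type": "text"},
        }},
        "year": {"type": "text"},
    }}}
}

field_types = flatten_mapping(response["prizes"]["mappings"]["properties"])
for path, field_type in sorted(field_types.items()):
    print(f"{path}: {field_type}")
```

Running the same function against the mapping of another index gives two flat dictionaries that can be compared field by field.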
Step3: Create an Index with Explicit Mapping
In this step we are going to create an index named “prizes_explicit_mapping” with explicit mapping of the fields. In the request body below, the type of the “id” and “share” fields is changed to “integer”, and the type of the “year” field is changed to “date”.
Request
PUT prizes_explicit_mapping
{
  "mappings": {
    "properties": {
      "category": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "laureates": {
        "properties": {
          "firstname": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "id": {
            "type": "integer",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "motivation": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "share": {
            "type": "integer",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "surname": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      },
      "overallMotivation": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "year": {
        "type": "date",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
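Instead of editing the JSON by hand, the request body can also be derived from the dynamic mapping by overriding only the types that need to change. The sketch below shows one possible way to do this in Python; the helper name is hypothetical and the starting mapping is a trimmed version of the “prizes” mapping with the multi-fields omitted.

```python
import copy
import json


def override_types(properties: dict, overrides: dict) -> dict:
    """Return a copy of a mapping "properties" block with selected types replaced.

    `overrides` maps dotted field paths (e.g. "laureates.id") to the new type.
    """
    updated = copy.deepcopy(properties)
    for path, new_type in overrides.items():
        node = updated
        *parents, leaf = path.split(".")
        for parent in parents:
            # Walk down through object fields to reach the leaf's container.
            node = node[parent]["properties"]
        node[leaf]["type"] = new_type
    return updated


# Trimmed dynamic mapping of the "prizes" index.
dynamic_properties = {
    "category": {"type": "text"},
    "laureates": {"properties": {
        "id": {"type": "text"},
        "share": {"type": "text"},
    }},
    "year": {"type": "text"},
}

explicit_properties = override_types(
    dynamic_properties,
    {"laureates.id": "integer", "laureates.share": "integer", "year": "date"},
)

# Body that could be sent with PUT prizes_explicit_mapping.
request_body = {"mappings": {"properties": explicit_properties}}
print(json.dumps(request_body, indent=2))
```

The original dictionary is left untouched because the helper works on a deep copy, so the dynamic and explicit definitions can still be compared afterwards.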
Step4: Index Documents using new Index with Explicit Mappings
Now we are going to run our script again, this time with the index name set to “prizes_explicit_mapping”, so that the documents are indexed into the new index according to its explicit mapping. Refer to “How to index json dataset in Elasticsearch” for the script details.
python indexDocuments.py
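The script itself is described in the earlier article and is not reproduced here. As a rough idea of what indexing into the new index involves, the sketch below builds a request body in the newline-delimited format expected by the Elasticsearch _bulk API. The function name and the sample documents are hypothetical and are not taken from indexDocuments.py.

```python
import json
from typing import Iterable


def build_bulk_body(index_name: str, documents: Iterable[dict]) -> str:
    """Build an NDJSON body for the _bulk API: one action line per document line."""
    lines = []
    for doc in documents:
        # Action line: index this document into the target index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical documents; numeric values are strings, as in a text-mapped data set.
docs = [
    {"year": "2021", "category": "physics", "laureates": [{"id": "1", "share": "2"}]},
    {"year": "1901", "category": "peace", "laureates": [{"id": "2", "share": "1"}]},
]

body = build_bulk_body("prizes_explicit_mapping", docs)
print(body)
```

Such a body would be sent as a POST to the _bulk endpoint with the Content-Type header set to application/x-ndjson. Numeric strings like "1" are accepted by “integer” fields because coercion is enabled by default for numeric field types.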
Step5: Validate the Indexed Documents
Once indexing is done, we can navigate to Stack Management – Data View and create a data view based on the new index. The prizes_explicit_mapping index now contains the documents, each indexed as per the mapping definition that was provided.
NOTE: Change the time filter to the last 2000 years to view the data
In this way, explicit mapping gives us control over the field types of the documents that are indexed in Elasticsearch.
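The wide time range is needed because “year” is now a date field: a year-only value is treated as a point in time at the start of that year, so older documents fall outside a short, recent time window. The sketch below illustrates this reasoning in Python. The year values, the fixed “now” and the 15 minute window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def year_to_timestamp(year: str) -> datetime:
    """Interpret a year-only value as midnight UTC on 1 January of that year."""
    return datetime(int(year), 1, 1, tzinfo=timezone.utc)


def in_time_range(ts: datetime, now: datetime, window: timedelta) -> bool:
    """Check whether a timestamp falls inside the window ending at `now`."""
    return now - window <= ts <= now


# Fixed "now" so the example is reproducible (assumption for illustration).
now = datetime(2023, 1, 1, tzinfo=timezone.utc)
short_window = timedelta(minutes=15)
long_window = timedelta(days=365 * 2000)  # roughly 2000 years

for year in ["1901", "2021"]:
    ts = year_to_timestamp(year)
    print(year, in_time_range(ts, now, short_window), in_time_range(ts, now, long_window))
```

Both sample years are excluded by the short window and included by the long one, which matches what the time filter change achieves in the data view.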
Hope you enjoyed reading this article. Thank you.