How to map JSON document fields using Index Mapping

In this article we will see how to use index mappings to define custom field mappings, which is also known as explicit mapping. We will create an index with a custom field mapping and index the documents into this new index. Finally, we will validate the difference in the field types using the Kibana console.

Test Environment

Fedora workstation 37
Docker
Docker Compose

What is Mapping

Every document that we index consists of data, and this data is held in the document's fields. A field can contain text, numbers, dates or geospatial locations. Mapping is the process of defining how a document, and the fields it contains, are stored and indexed; each field is mapped to a corresponding field type.

There are two ways mapping can be carried out in Elasticsearch.

Mapping | Description
Dynamic Mapping | Documents are indexed with a default mapping definition whose field types are inferred from the document data. The automatic detection and addition of new fields is called dynamic mapping.
Explicit Mapping | Field types are explicitly defined in the mapping definition and applied to the documents being indexed. This gives greater control over the field types.

An index can contain both explicitly mapped and dynamically mapped fields.
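
To make dynamic mapping more concrete, here is a minimal Python sketch of how a field type can be guessed from a JSON value. It is a simplified approximation of Elasticsearch's default rules, not its actual implementation: it ignores date detection for strings and dynamic templates, and the function name guess_dynamic_type is made up for this illustration.

```python
def guess_dynamic_type(value):
    """Roughly approximate the field type dynamic mapping would pick.

    Simplified on purpose: ignores date detection and dynamic templates.
    """
    if value is None:
        return None  # null values do not add a field to the mapping
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"  # by default with a "keyword" sub-field
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # arrays take the type of their first non-null element
        for item in value:
            if item is not None:
                return guess_dynamic_type(item)
        return None
    raise TypeError(f"unsupported JSON value: {value!r}")


# A document whose numbers are stored as JSON strings ends up with text fields.
sample = {"year": "2021", "share": "2", "laureates": [{"id": "1002"}]}
print({name: guess_dynamic_type(v) for name, v in sample.items()})
```

The sample values are placeholders. The point is that a number wrapped in quotes is still a JSON string, so dynamic mapping treats it as text unless we say otherwise with an explicit mapping.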

If you prefer watching a video, here is the YouTube video covering the same step-by-step procedure outlined below.

Procedure

Step1: Ensure the JSON data set is indexed with Dynamic Mapping

This article is a continuation of my previous article “How to index json dataset in Elasticsearch”, in which we took a JSON data set and indexed it in Elasticsearch. Elasticsearch takes care of mapping the fields based on the JSON field data, using dynamic mapping, which is enabled by default in an Elasticsearch cluster setup.

Step2: Get the Index Mapping details

Let’s use the following API request to get the current mapping of the “prizes” index. You can run this request using the Dev Tools provided in the Kibana console. As you can see from the response, every leaf field is of type “text”, each with a “keyword” sub-field.

Request

GET prizes/_mapping

Response

{
  "prizes": {
    "mappings": {
      "properties": {
        "category": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "laureates": {
          "properties": {
            "firstname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "id": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "motivation": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "share": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "surname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "overallMotivation": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "year": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
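
A nested response like the one above is easier to scan when it is flattened into one line per field. The following is a small Python sketch that does this; the function name flatten_mapping is hypothetical, and the response dictionary is a trimmed-down copy of the output above with only two fields kept for brevity.

```python
def flatten_mapping(properties, prefix=""):
    """Return {dotted.field.path: type} for a mapping "properties" dict."""
    flat = {}
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        if "properties" in definition:
            # object field: recurse into its sub-fields
            flat.update(flatten_mapping(definition["properties"], prefix=f"{path}."))
        else:
            flat[path] = definition.get("type", "object")
    return flat


# Trimmed-down copy of the response above (only two fields kept for brevity).
response = {
    "prizes": {
        "mappings": {
            "properties": {
                "laureates": {
                    "properties": {
                        "id": {"type": "text"},
                    }
                },
                "year": {"type": "text"},
            }
        }
    }
}

flat = flatten_mapping(response["prizes"]["mappings"]["properties"])
for path, field_type in sorted(flat.items()):
    print(f"{path}: {field_type}")
```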

Step3: Create an Index with Explicit Mapping

In this step we create an index named “prizes_explicit_mapping” with an explicit mapping of the fields. As the request below shows, we have changed the field type of “id” and “share” to “integer”, and the field type of “year” to “date”.

Request

PUT prizes_explicit_mapping
{
    "mappings": {
      "properties": {
        "category": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "laureates": {
          "properties": {
            "firstname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "id": {
              "type": "integer",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "motivation": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "share": {
              "type": "integer",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "surname": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        "overallMotivation": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "year": {
          "type": "date",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
}
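
Instead of editing the mapping JSON by hand, the same request body can be derived from the dynamic mapping by overriding only the types that should change. This is a minimal Python sketch of that idea, not the method used in this article; with_type_overrides is a hypothetical helper, and the input is a trimmed-down mapping containing only the fields being changed.

```python
import copy


def with_type_overrides(properties, overrides):
    """Return a deep copy of `properties` with types replaced per `overrides`.

    `overrides` maps dotted field paths (e.g. "laureates.id") to new types.
    """
    result = copy.deepcopy(properties)
    for path, new_type in overrides.items():
        node = result
        *parents, leaf = path.split(".")
        for parent in parents:
            node = node[parent]["properties"]  # descend into object fields
        node[leaf]["type"] = new_type
    return result


# Trimmed-down dynamic mapping (assumption: only the fields we change are shown).
dynamic_properties = {
    "laureates": {"properties": {"id": {"type": "text"}, "share": {"type": "text"}}},
    "year": {"type": "text"},
}

explicit_properties = with_type_overrides(
    dynamic_properties,
    {"laureates.id": "integer", "laureates.share": "integer", "year": "date"},
)
request_body = {"mappings": {"properties": explicit_properties}}
```

Serialized with json.dumps, request_body has the same shape as the body of the PUT prizes_explicit_mapping request. The original dictionary is left untouched because the helper works on a deep copy.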

Step4: Index Documents using new Index with Explicit Mappings

Now we run our script again, this time pointing it at the new index named “prizes_explicit_mapping”, so that the documents are indexed using the explicit mapping. Refer to “How to index json dataset in Elasticsearch” for the script details.

[admin@fedser elastic-kibana]$ python indexDocuments.py 
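
The indexDocuments.py script itself is covered in the previous article and is not reproduced here. As a rough, independent sketch of what indexing into the new index can look like, the helper below builds a request body for the Elasticsearch _bulk API. The function name to_bulk_ndjson and the sample documents are made up for illustration; the values are placeholders, not real data.

```python
import json


def to_bulk_ndjson(index_name, documents):
    """Build a newline-delimited JSON body for the _bulk API.

    Each document becomes two lines: an "index" action, then the document.
    """
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Placeholder documents (hypothetical values, shaped like the prizes data).
docs = [
    {"year": "2001", "category": "example", "laureates": [{"id": "1", "share": "1"}]},
    {"year": "2002", "category": "example", "laureates": [{"id": "2", "share": "2"}]},
]

body = to_bulk_ndjson("prizes_explicit_mapping", docs)
print(body)
```

Such a body would be sent with POST _bulk and the Content-Type application/x-ndjson. With the explicit mapping in place, string values such as "1" are still accepted for the integer fields, because Elasticsearch coerces numeric strings by default.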

Step5: Validate the Indexed Documents

Once indexing is done, navigate to Stack Management > Data Views and create a Data View based on the new index. You will see the prizes_explicit_mapping index, with each document indexed as per the mapping definition that was provided.

NOTE: Change the time filter to the last 2000 years to view the data.
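
Besides the Data View, the change can also be checked by comparing the field types of the two indices. The following is a minimal Python sketch, assuming the types have already been collected into plain dictionaries keyed by dotted field path (for example from GET prizes/_mapping and GET prizes_explicit_mapping/_mapping); changed_types is a hypothetical helper name.

```python
def changed_types(before, after):
    """Return {field: (old_type, new_type)} for fields whose type differs."""
    return {
        field: (before[field], after[field])
        for field in sorted(before.keys() & after.keys())
        if before[field] != after[field]
    }


# Leaf field types of the two indices, as described in this article.
prizes = {
    "category": "text",
    "laureates.firstname": "text",
    "laureates.id": "text",
    "laureates.motivation": "text",
    "laureates.share": "text",
    "laureates.surname": "text",
    "overallMotivation": "text",
    "year": "text",
}
prizes_explicit_mapping = dict(
    prizes,
    **{"laureates.id": "integer", "laureates.share": "integer", "year": "date"},
)

for field, (old, new) in changed_types(prizes, prizes_explicit_mapping).items():
    print(f"{field}: {old} -> {new}")
```

Only the three fields that were overridden in Step3 should be reported; everything else keeps its “text” type.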

In this way, explicit mapping gives us control over the field types of the documents that are indexed in Elasticsearch.

Hope you enjoyed reading this article. Thank you.