The ATIS (Airline Travel Information System) Dataset

xiaotingxuan/ATIS_dataset

This repository contains the ATIS dataset in Python pickle format and in Rasa NLU JSON format (https://rasa.com/docs/nlu/dataformat/#json-format), together with a script for reading the data and sample code.

Data samples

Raw format

   0:         flight: BOS i want to fly from boston at 838 am and arrive in denver at 1110 in the morning EOS
                              BOS                                        O
                                i                                        O
                             want                                        O
                               to                                        O
                              fly                                        O
                             from                                        O
                           boston                      B-fromloc.city_name
                               at                                        O
                              838                       B-depart_time.time
                               am                       I-depart_time.time
                              and                                        O
                           arrive                                        O
                               in                                        O
                           denver                        B-toloc.city_name
                               at                                        O
                             1110                       B-arrive_time.time
                               in                                        O
                              the                                        O
                          morning              B-arrive_time.period_of_day
                              EOS                                        O
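
The slot labels in the sample above follow the BIO scheme: `B-` opens an entity, `I-` continues it, and `O` marks tokens outside any entity. As a minimal sketch (not the repository's own code), the function below groups such tags into entity spans. The name `bio_to_spans` is hypothetical, and the token and tag lists are copied by hand from the sample.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_label, entity_text) pairs.

    An I- tag that does not continue the currently open entity is ignored.
    """
    spans = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if it is still open.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the entity that is currently open.
            words.append(token)
        else:
            # "O" (or a stray I- tag) closes any open entity.
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans


# Tokens and tags transcribed from the raw-format sample above.
tokens = ("BOS i want to fly from boston at 838 am and arrive "
          "in denver at 1110 in the morning EOS").split()
tags = (["O"] * 6
        + ["B-fromloc.city_name", "O", "B-depart_time.time", "I-depart_time.time"]
        + ["O"] * 3
        + ["B-toloc.city_name", "O", "B-arrive_time.time", "O", "O",
           "B-arrive_time.period_of_day", "O"])

spans = bio_to_spans(tokens, tags)
for entity_label, entity_text in spans:
    print(f"{entity_label:30s} {entity_text}")
```

For this sample the two-token span `838 am` is merged into a single `depart_time.time` entity, while `1110` and `morning` stay separate because they carry different labels.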

Rasa NLU JSON format

{
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {
                        "start": 35,
                        "end": 44,
                        "value": "charlotte",
                        "entity": "fromloc.city_name"
                    },
                    {
                        "start": 48,
                        "end": 57,
                        "value": "las vegas",
                        "entity": "toloc.city_name"
                    },
                    {
                        "start": 79,
                        "end": 88,
                        "value": "st. louis",
                        "entity": "stoploc.city_name"
                    }
                ]
            },
            ...
        ]
    }
}
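
In this format, `start` and `end` are character offsets into `text`, with `end` exclusive, so `text[start:end]` should equal `value`. The sketch below checks that invariant. The helper name `check_entity_offsets` is hypothetical, and the inline `sample` dictionary reproduces only the one example shown above.

```python
def check_entity_offsets(rasa_data: dict) -> list:
    """Return (text, entity) pairs whose offsets do not match the entity value."""
    mismatches = []
    for example in rasa_data["rasa_nlu_data"]["common_examples"]:
        text = example["text"]
        for entity in example.get("entities", []):
            # `end` is exclusive, so a plain slice recovers the entity text.
            if text[entity["start"]:entity["end"]] != entity["value"]:
                mismatches.append((text, entity))
    return mismatches


# The single example shown above, written as a Python dict.
sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {
                "text": "i would like to find a flight from charlotte "
                        "to las vegas that makes a stop in st. louis",
                "intent": "flight",
                "entities": [
                    {"start": 35, "end": 44, "value": "charlotte",
                     "entity": "fromloc.city_name"},
                    {"start": 48, "end": 57, "value": "las vegas",
                     "entity": "toloc.city_name"},
                    {"start": 79, "end": 88, "value": "st. louis",
                     "entity": "stoploc.city_name"},
                ],
            }
        ]
    }
}

print(check_entity_offsets(sample))  # prints [] because all three offsets match
```

To run the same check on a whole file, load it first with `json.load(open("train.json", encoding="utf-8"))` and pass the result to the function.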

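Dataset statistics

| Samples | Vocabulary size | Entities | Intents |
| --- | --- | --- | --- |
| 4978 (training set) + 893 (test set) | 943 | 129 | 26 |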

Sample code

summary_data.py contains the code that reads the raw data. Use it as a reference when implementing your own loader for the raw files.
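
The internal layout of the pickle files is not described here, and summary_data.py remains the authoritative reference for it. As a hedged starting point, the sketch below loads a pickle file and prints an outline of whatever object it contains, without assuming any particular structure. Both `load_pickle` and `summarize` are hypothetical helpers, not part of this repository.

```python
import pickle
from typing import Any


def load_pickle(path: str) -> Any:
    """Load one pickled object. Only unpickle files from a source you trust."""
    with open(path, "rb") as stream:
        return pickle.load(stream)


def summarize(obj: Any, depth: int = 0, max_depth: int = 2, max_items: int = 5) -> str:
    """Outline nested dicts, lists, and tuples: type, size, and the first few children."""
    pad = "  " * depth
    if isinstance(obj, dict):
        lines = [f"{pad}dict, {len(obj)} key(s)"]
        children = [(repr(key), value) for key, value in list(obj.items())[:max_items]]
    elif isinstance(obj, (list, tuple)):
        lines = [f"{pad}{type(obj).__name__}, {len(obj)} item(s)"]
        children = [(f"[{i}]", value) for i, value in enumerate(obj[:max_items])]
    else:
        # Leaf value: report only its type.
        return f"{pad}{type(obj).__name__}"
    if depth < max_depth:
        for label, value in children:
            lines.append(f"{pad}  {label}:")
            lines.append(summarize(value, depth + 2, max_depth, max_items))
    return "\n".join(lines)


# Example usage, assuming atis.train.pkl is in the working directory:
# print(summarize(load_pickle("atis.train.pkl")))
```

The outline is limited to the first few children at each level so that large vocabularies do not flood the output.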

Download

| Data format | Training set | Test set |
| --- | --- | --- |
| Python 3 pickle format | atis.train.pkl | atis.test.pkl |
| Rasa NLU JSON format | train.json | test.json |

Credit

Related projects
