Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
yuanbing and facebook-github-bot Time/es: Make "n horas" latent". (#478)
Summary:
1. ~~Fixed broken build due to the problem with main test entry point;~~
2. Fixed the ambiguous results caused by mishandling the
ranking rules for parsing frames in ES. For example, "una hora"
could be interpreted either as a "Duration" or as "1pm" in the "Time" dimension.
The expected result is in the "Duration" dimension.
3. ~~ignore stack lock file~~
Pull Request resolved: #478

Test Plan:
```
:test Endpoint.Duckling.Tests --hide-successes
[1003 of 1003] Endpoint.Duckling.Tests (Duckling.Api changed)
Ok, two modules loaded.

All 357 tests passed (79.69s)
```

```
haxlsh> H.io $ debug (makeLocale ES Nothing) "de una horas" [This Time, This Duration]
<integer> <unit-of-duration> (una horas)
-- number (0..15) (una)
-- -- regex (una)
-- hora (grain) (horas)
-- -- regex (horas)
[Entity {dim = "duration", body = "una horas", value = RVal Duration (DurationData {value = 1, grain = Hour}), start = 3, end = 12, latent = False, enode = Node {nodeRange = Range 3 12, token = Token Duration (DurationData {value = 1, grain = Hour}), children = [Node {nodeRange = Range 3 6, token = Token Numeral (NumeralData {value = 1.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 3 6, token = Token RegexMatch (GroupMatch ["una","","a","","",""]), children = [], rule = Nothing}], rule = Just "number (0..15)"},Node {nodeRange = Range 7 12, token = Token TimeGrain Hour, children = [Node {nodeRange = Range 7 12, token = Token RegexMatch (GroupMatch ["ora"]), children = [], rule = Nothing}], rule = Just "hora (grain)"}], rule = Just "<integer> <unit-of-duration>"}}]
it :: [Entity]
```

Reviewed By: fascpt

Differential Revision: D21770015

Pulled By: chinmay87

fbshipit-source-id: 3056fcf656140c9d65b70b5c604a286ea2c307b2
Latest commit 1dac46a May 29, 2020

Type Name Latest commit message Commit time
Duckling Time/es: Make "n horas" latent". (#478) May 29, 2020
dist-newstyle/cache Add Numeral dimension for new language TH (#399) Nov 27, 2019
exe AF Setup + Numeral (#422) Jan 10, 2020
tests Time/es: Make "n horas" latent". (#478) May 29, 2020
.dockerignore Improve Docker build (#341) Apr 17, 2020
.gitignore Hindi Language Numeral Dimension(minimalistic model). Tests passed. Dec 19, 2017
.travis.yml Update dependencies to latest version to make duckling compile with g… Feb 22, 2019
CODE_OF_CONDUCT.md add FB code of conduct Jan 2, 2018
CONTRIBUTING.md Documentation: Coding style May 14, 2018
Dockerfile Improve Docker build (#341) Apr 17, 2020
LICENSE Initial commit Mar 8, 2017
README.md Update: add new dimension to a language Jul 9, 2019
duckling.cabal Numeral/EN: Fixes ambiguous parses when both ruleNegative and ruleMul… Mar 13, 2020
logo.png Adding logo Mar 15, 2017
stack.yaml Update to lts-9.10 Oct 27, 2017

README.md

Duckling is a Haskell library that parses text into structured data.

"the first Tuesday of October"
=> {"value":"2017-10-03T00:00:00.000-07:00","grain":"day"}

Requirements

A Haskell environment is required. We recommend using stack.

On macOS you'll need to install PCRE development headers. The easiest way to do that is with Homebrew:

brew install pcre

If that doesn't help, try running brew doctor and fixing the issues it finds.

Quickstart

To compile and run the binary:

$ stack build
$ stack exec duckling-example-exe

The first time you run it, it will download all required packages.

This runs a basic HTTP server. Example request:

$ curl -XPOST http://0.0.0.0:8000/parse --data 'locale=en_GB&text=tomorrow at eight'

See exe/ExampleMain.hs for an example of how to integrate Duckling into your project. If your backend doesn't run Haskell or if you don't want to spin up your own Duckling server, you can directly use wit.ai's built-in entities.

Supported dimensions

Duckling supports many languages, but most don't support all dimensions yet (we need your help!). Please look into this directory for language-specific support.

| Dimension | Example input | Example value output |
| --- | --- | --- |
| AmountOfMoney | "42€" | `{"value":42,"type":"value","unit":"EUR"}` |
| CreditCardNumber | "4111-1111-1111-1111" | `{"value":"4111111111111111","issuer":"visa"}` |
| Distance | "6 miles" | `{"value":6,"type":"value","unit":"mile"}` |
| Duration | "3 mins" | `{"value":3,"minute":3,"unit":"minute","normalized":{"value":180,"unit":"second"}}` |
| Email | "duckling-team@fb.com" | `{"value":"duckling-team@fb.com"}` |
| Numeral | "eighty eight" | `{"value":88,"type":"value"}` |
| Ordinal | "33rd" | `{"value":33,"type":"value"}` |
| PhoneNumber | "+1 (650) 123-4567" | `{"value":"(+1) 6501234567"}` |
| Quantity | "3 cups of sugar" | `{"value":3,"type":"value","product":"sugar","unit":"cup"}` |
| Temperature | "80F" | `{"value":80,"type":"value","unit":"fahrenheit"}` |
| Time | "today at 9am" | `{"values":[{"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"}],"value":"2016-12-14T09:00:00.000-08:00","grain":"hour","type":"value"}` |
| Url | "https://api.wit.ai/message?q=hi" | `{"value":"https://api.wit.ai/message?q=hi","domain":"api.wit.ai"}` |
| Volume | "4 gallons" | `{"value":4,"type":"value","unit":"gallon"}` |

Custom dimensions are also supported.
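
The Duration row above carries a "normalized" field that restates the value in seconds ("3 mins" becomes 180 seconds). The following is a minimal sketch of that arithmetic only. The `Unit`, `Duration`, `secondsPer`, and `normalize` names are made up for illustration and are not Duckling's own types, and only fixed-length units are modelled.

```haskell
module Main where

-- Hypothetical types for illustration; not Duckling's data model.
-- Months and years are left out because their length is not fixed.
data Unit = Second | Minute | Hour | Day | Week
  deriving (Show, Eq)

data Duration = Duration { amount :: Int, unit :: Unit }
  deriving (Show, Eq)

-- Number of seconds in one of each unit.
secondsPer :: Unit -> Int
secondsPer Second = 1
secondsPer Minute = 60
secondsPer Hour   = 60 * 60
secondsPer Day    = 24 * 60 * 60
secondsPer Week   = 7 * 24 * 60 * 60

-- Restate a duration in seconds.
normalize :: Duration -> Duration
normalize (Duration n u) = Duration (n * secondsPer u) Second

main :: IO ()
main = print (normalize (Duration 3 Minute))
-- prints: Duration {amount = 180, unit = Second}
```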

Extending Duckling

To regenerate the classifiers and run the test suite:

$ stack build :duckling-regen-exe && stack exec duckling-regen-exe && stack test

It's important to regenerate the classifiers after updating the code and before running the test suite.

To extend Duckling's support for a dimension in a given language, typically 4 files need to be updated:

  • Duckling/<Dimension>/<Lang>/Rules.hs
  • Duckling/<Dimension>/<Lang>/Corpus.hs
  • Duckling/Dimensions/<Lang>.hs (if not already present in Duckling/Dimensions/Common.hs)
  • Duckling/Rules/<Lang>.hs

To add a new language:

To add a new locale:

Rules have a name, a pattern and a production. Patterns are used to perform character-level matching (regexes on input) and concept-level matching (predicates on tokens). Productions are arbitrary functions that take a list of tokens and return a new token.
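
To make the name / pattern / production split concrete, here is a toy model that uses only the Haskell base library. It is a sketch under loud assumptions: `Token`, `PatternItem`, `Rule`, and `applyRule` are hypothetical and much simpler than Duckling's real types, the input is assumed to be pre-split into words, and a plain `String -> Bool` test stands in for a regex.

```haskell
module Main where

import Data.Char (isDigit)

-- Hypothetical token type; real tokens carry far more information.
data Token
  = Raw String   -- a piece of input text, not yet interpreted
  | Number Int
  | Minutes Int
  deriving (Show, Eq)

-- A pattern item matches either characters or a concept.
data PatternItem
  = Chars (String -> Bool)   -- character-level: stand-in for a regex
  | Concept (Token -> Bool)  -- concept-level: predicate on a token

data Rule = Rule
  { ruleName    :: String
  , rulePattern :: [PatternItem]
  , ruleProd    :: [Token] -> Maybe Token  -- the production
  }

matches :: PatternItem -> Token -> Bool
matches (Chars p)   (Raw s) = p s
matches (Chars _)   _       = False
matches (Concept p) t       = p t

-- Run the production only if every pattern item matches its token.
applyRule :: Rule -> [Token] -> Maybe Token
applyRule rule tokens
  | length tokens == length (rulePattern rule)
  , and (zipWith matches (rulePattern rule) tokens) = ruleProd rule tokens
  | otherwise = Nothing

isNumber :: Token -> Bool
isNumber (Number _) = True
isNumber _          = False

integerNumeric :: Rule
integerNumeric = Rule
  { ruleName    = "integer (numeric)"
  , rulePattern = [Chars (\s -> not (null s) && all isDigit s)]
  , ruleProd    = \tokens -> case tokens of
      [Raw s] -> Just (Number (read s))
      _       -> Nothing
  }

integerUnit :: Rule
integerUnit = Rule
  { ruleName    = "<integer> <unit-of-duration>"
  , rulePattern = [Concept isNumber, Chars (`elem` ["minute", "minutes"])]
  , ruleProd    = \tokens -> case tokens of
      [Number n, Raw _] -> Just (Minutes n)
      _                 -> Nothing
  }

main :: IO ()
main = do
  let two = applyRule integerNumeric [Raw "2"]
  print two                                                       -- Just (Number 2)
  print (two >>= \t -> applyRule integerUnit [t, Raw "minutes"])  -- Just (Minutes 2)
```

The second rule consumes the token produced by the first, which is the sense in which rules compose: a production's output can satisfy a concept-level pattern item of another rule.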

The corpus (resp. negative corpus) is a list of examples that should (resp. shouldn't) parse. The reference time for the corpus is Tuesday Feb 12, 2013 at 4:30am.
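
The corpus idea can be sketched with a stand-in parser. Everything below is hypothetical and self-contained: `parseMinutes` is a toy that accepts only "<digits> minute(s)", and the two lists play the roles of a corpus and a negative corpus. Duckling's own corpus files have a different shape, and the reference time is ignored here.

```haskell
module Main where

import Data.Char (isDigit)

-- Toy parser used only for this illustration.
parseMinutes :: String -> Maybe Int
parseMinutes input = case words input of
  [n, u]
    | all isDigit n, u `elem` ["minute", "minutes"] -> Just (read n)
  _ -> Nothing

-- Examples that should parse, paired with the expected value.
corpus :: [(String, Int)]
corpus = [("2 minutes", 2), ("1 minute", 1)]

-- Examples that should not parse at all.
negativeCorpus :: [String]
negativeCorpus = ["minutes", "two minutes", "2 apples"]

-- Corpus entries whose parse is missing or has the wrong value.
corpusFailures :: [String]
corpusFailures = [s | (s, expected) <- corpus, parseMinutes s /= Just expected]

-- Negative-corpus entries that parsed even though they should not.
negativeFailures :: [String]
negativeFailures = [s | s <- negativeCorpus, parseMinutes s /= Nothing]

main :: IO ()
main = do
  print corpusFailures    -- [] when every positive example parses
  print negativeFailures  -- [] when no negative example parses
```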

Duckling.Debug provides a few debugging tools:

$ stack repl --no-load
> :l Duckling.Debug
> debug (makeLocale EN $ Just US) "in two minutes" [This Time]
in|within|after <duration> (in two minutes)
-- regex (in)
-- <integer> <unit-of-duration> (two minutes)
-- -- integer (0..19) (two)
-- -- -- regex (two)
-- -- minute (grain) (minutes)
-- -- -- regex (minutes)
[Entity {dim = "time", body = "in two minutes", value = RVal Time (TimeValue (SimpleValue (InstantValue {vValue = 2013-02-12 04:32:00 -0200, vGrain = Second})) [SimpleValue (InstantValue {vValue = 2013-02-12 04:32:00 -0200, vGrain = Second})] Nothing), start = 0, end = 14}]
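
The indented lines in the debug output above read as a parse tree: each `-- ` prefix marks one level of nesting, and each line shows a rule name followed by the text it matched. The sketch below rebuilds that layout from a hand-written tree. The `Node` type and `render` function are assumptions made for this illustration, not the implementation in Duckling.Debug.

```haskell
module Main where

-- Hypothetical tree node: rule name, matched text, child nodes.
data Node = Node
  { nodeRule     :: String
  , nodeBody     :: String
  , nodeChildren :: [Node]
  }

-- One line per node, prefixed with "-- " once per nesting level.
render :: Int -> Node -> [String]
render depth (Node rule body children) =
  (concat (replicate depth "-- ") ++ rule ++ " (" ++ body ++ ")")
    : concatMap (render (depth + 1)) children

-- The tree for "in two minutes", transcribed from the output above.
example :: Node
example =
  Node "in|within|after <duration>" "in two minutes"
    [ Node "regex" "in" []
    , Node "<integer> <unit-of-duration>" "two minutes"
        [ Node "integer (0..19)" "two" [Node "regex" "two" []]
        , Node "minute (grain)" "minutes" [Node "regex" "minutes" []]
        ]
    ]

main :: IO ()
main = mapM_ putStrLn (render 0 example)
```

Running it prints the same seven tree lines as the debug session, from `in|within|after <duration> (in two minutes)` down to `-- -- -- regex (minutes)`.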

License

Duckling is BSD-licensed.
