Merge remote-tracking branch 'upstream/master'

kengz 2016-12-30 12:19:59 -05:00
commit 73a38bd4d1
506 changed files with 20247 additions and 3083769 deletions

106
.github/CONTRIBUTOR_AGREEMENT.md vendored Normal file

@@ -0,0 +1,106 @@
# spaCy contributor agreement
This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. We may publicly disclose your
participation in the project, including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [ ] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | |
| Company name (if applicable) | |
| Title or role (if applicable) | |
| Date | |
| GitHub username | |
| Website (optional) | |

14
.github/ISSUE_TEMPLATE.md vendored Normal file

@@ -0,0 +1,14 @@
<!--- Please provide a summary in the title and describe your issue here.
Is this a bug or feature request? If a bug, include all the steps that led to the issue.
If you're looking for help with your code, consider posting a question on StackOverflow instead:
http://stackoverflow.com/questions/tagged/spacy -->
## Your Environment
<!-- Include details of your environment -->
* **Operating System:**
* **Python Version Used:**
* **spaCy Version Used:**
* **Environment Information:**

30
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file

@@ -0,0 +1,30 @@
<!--- Provide a general summary of your changes in the Title -->
## Description
<!--- Describe your changes -->
## Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If fixing an open issue, please link to the issue here. -->
## How Has This Been Tested?
<!--- Please describe in detail your tests. Did you add new tests? -->
<!--- Include details of your testing environment, and the tests you ran too -->
<!--- How were other areas of the code affected? -->
## Screenshots (if appropriate):
## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all applicable boxes. -->
- [ ] Bug fix (non-breaking change fixing an issue)
- [ ] New feature (non-breaking change adding functionality to spaCy)
- [ ] Breaking change (fix or feature causing change to spaCy's existing functionality)
- [ ] Documentation (Addition to documentation of spaCy)
## Checklist:
<!--- Go over all the following points, and put an `x` in all applicable boxes. -->
- [ ] My code follows spaCy's code style.
- [ ] My change requires a change to spaCy's documentation.
- [ ] I have updated the documentation accordingly.
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.

98
.github/contributors/NSchrading.md vendored Normal file

@@ -0,0 +1,98 @@
# Syllogism contributor agreement
This Syllogism Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
Syllogism Co. The term **"you"** shall mean the person or entity identified
below.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. Syllogism Co. may publicly
disclose your participation in the project, including the fact that you have
signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [x] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | J Nicolas Schrading |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-08-24 |
| GitHub username | NSchrading |
| Website (optional) | nicschrading.com |

107
.github/contributors/RvanNieuwpoort.md vendored Executable file

@@ -0,0 +1,107 @@
# spaCy contributor agreement
This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. We may publicly disclose your
participation in the project, including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [ ] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [x] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------------------- |
| Name | Rob van Nieuwpoort |
| Signing on behalf of | Dafne van Kuppevelt, Janneke van der Zwaan, Willem van Hage |
| Company name (if applicable) | Netherlands eScience center |
| Title or role (if applicable) | Director of technology |
| Date | 14-12-2016 |
| GitHub username | RvanNieuwpoort |
| Website (optional) | https://www.esciencecenter.nl/ |

98
.github/contributors/chrisdubois.md vendored Normal file

@@ -0,0 +1,98 @@
# Syllogism contributor agreement
This Syllogism Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
Syllogism Co. The term **"you"** shall mean the person or entity identified
below.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. Syllogism Co. may publicly
disclose your participation in the project, including the fact that you have
signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [x] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Chris DuBois |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015.10.07 |
| GitHub username | chrisdubois |
| Website (optional) | |

106
.github/contributors/magnusburton.md vendored Normal file

@@ -0,0 +1,106 @@
# spaCy contributor agreement
This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. We may publicly disclose your
participation in the project, including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [x] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------------------- |
| Name | Magnus Burton |
| Company name (if applicable) | |
| Title or role (if applicable) | |
| Date | 17-12-2016 |
| GitHub username | magnusburton |
| Website (optional) | |

106
.github/contributors/oroszgy.md vendored Normal file

@@ -0,0 +1,106 @@
# spaCy contributor agreement
This spaCy Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
[ExplosionAI UG (haftungsbeschränkt)](https://explosion.ai/legal). The term
**"you"** shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested
below and include the filled-in version with your first pull request, under the
folder [`.github/contributors/`](/.github/contributors/). The name of the file
should be your GitHub username, with the extension `.md`. For example, the user
example_user would create the file `.github/contributors/example_user.md`.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. We may publicly disclose your
participation in the project, including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [X] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | György Orosz |
| Company name (if applicable) | |
| Title or role (if applicable) | |
| Date | 2016-12-26 |
| GitHub username | oroszgy |
| Website (optional) | gyorgy.orosz.link |

98
.github/contributors/suchow.md vendored Normal file

@@ -0,0 +1,98 @@
# Syllogism contributor agreement
This Syllogism Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
Syllogism Co. The term **"you"** shall mean the person or entity identified
below.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. Syllogism Co. may publicly
disclose your participation in the project, including the fact that you have
signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [x] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Jordan Suchow |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-04-19 |
| GitHub username | suchow |
| Website (optional) | http://suchow.io |

98
.github/contributors/vsolovyov.md vendored Normal file

@ -0,0 +1,98 @@
# Syllogism contributor agreement
This Syllogism Contributor Agreement (**"SCA"**) is based on the
[Oracle Contributor Agreement](http://www.oracle.com/technetwork/oca-405177.pdf).
The SCA applies to any contribution that you make to any product or project
managed by us (the **"project"**), and sets out the intellectual property rights
you grant to us in the contributed materials. The term **"us"** shall mean
Syllogism Co. The term **"you"** shall mean the person or entity identified
below.
## Contributor Agreement
1. The term "contribution" or "contributed materials" means any source code,
object code, patch, tool, sample, graphic, specification, manual,
documentation, or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and
registrations, in your contribution:
* you hereby assign to us joint ownership, and to the extent that such
assignment is or becomes invalid, ineffective or unenforceable, you hereby
grant to us a perpetual, irrevocable, non-exclusive, worldwide, no-charge,
royalty-free, unrestricted license to exercise all rights under those
copyrights. This includes, at our option, the right to sublicense these same
rights to third parties through multiple levels of sublicensees or other
licensing arrangements;
* you agree that each of us can do all things in relation to your
contribution as if each of us were the sole owners, and if one of us makes
a derivative work of your contribution, the one who makes the derivative
work (or has it made) will be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution
against us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and
exercise all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the
consent of, pay or render an accounting to the other for any use or
distribution of your contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable,
non-exclusive, worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer
your contribution in whole or in part, alone or in combination with or
included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through
multiple levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective
on the date you first submitted a contribution to us, even if your submission
took place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of
authorship and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any
third party's copyrights, trademarks, patents, or other intellectual
property rights; and
* each contribution shall be in compliance with U.S. export control laws and
other applicable export and import laws. You agree to notify us if you
become aware of any circumstance which would make any of the foregoing
representations inaccurate in any respect. Syllogism Co. may publicly
disclose your participation in the project, including the fact that you have
signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable
U.S. Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
* [x] I am signing on behalf of myself as an individual and no other person
or entity, including my employer, has or will have rights with respect to my
contributions.
* [ ] I am signing on behalf of my employer or a legal entity and I have the
actual authority to contractually bind that entity.
## Contributor Details
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Vsevolod Solovyov |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-08-24 |
| GitHub username | vsolovyov |
| Website (optional) | |

20
.gitignore vendored

@ -9,12 +9,12 @@ tmp/
.eggs .eggs
*.tgz *.tgz
.sass-cache .sass-cache
.python-version
MANIFEST MANIFEST
corpora/ corpora/
models/ models/
examples/
keys/ keys/
spacy/syntax/*.cpp spacy/syntax/*.cpp
@ -29,13 +29,12 @@ spacy/orthography/*.cpp
ext/murmurhash.cpp ext/murmurhash.cpp
ext/sparsehash.cpp ext/sparsehash.cpp
data/en/pos /spacy/data/
data/en/ner
data/en/lexemes
data/en/strings
_build/ _build/
.env/ .env/
tmp/
cythonize.json
# Byte-compiled / optimized / DLL files # Byte-compiled / optimized / DLL files
__pycache__/ __pycache__/
@ -73,11 +72,6 @@ htmlcov/
nosetests.xml nosetests.xml
coverage.xml coverage.xml
# Website
website/www/
website/demos/displacy/
website/demos/sense2vec/
# Translations # Translations
*.mo *.mo
@ -99,9 +93,15 @@ website/demos/sense2vec/
# Mac OS X # Mac OS X
*.DS_Store *.DS_Store
# Temporary files / Dropbox hack
*.~*
# Komodo project files # Komodo project files
*.komodoproject *.komodoproject
# Website # Website
website/_deploy.sh website/_deploy.sh
website/package.json website/package.json
website/blog/announcement.jade
website/www/
website/.gitignore


@ -1,33 +1,29 @@
language: python language: python
sudo: required sudo: false
dist: trusty dist: trusty
group: edge group: edge
python: python:
- "2.7" - "2.7"
- "3.4" - "3.5"
os: os:
- linux - linux
env:
- VIA=compile LC_ALL=en_US.ascii
- VIA=compile
- VIA=sdist
install: install:
- "pip install -r requirements.txt" - "./travis.sh"
- "pip install -e ."
- "mkdir -p corpora/en"
- "cd corpora/en"
- "wget --no-check-certificate http://wordnetcode.princeton.edu/3.0/WordNet-3.0.tar.gz"
- "tar -xzf WordNet-3.0.tar.gz"
- "mv WordNet-3.0 wordnet"
- "cd ../../"
- "python bin/init_model.py en lang_data/ corpora/ data"
- "cp package.json data"
- "sputnik build data en_default.sputnik"
- "sputnik --name spacy install en_default.sputnik"
script: script:
- "pip install pytest" - "pip install pytest"
- "python -m pytest spacy" - if [[ "${VIA}" == "compile" ]]; then python -m pytest spacy; fi
- if [[ "${VIA}" == "pypi" ]]; then python -m pytest `python -c "import pathlib; import spacy; print(pathlib.Path(spacy.__file__).parent.resolve())"`; fi
- if [[ "${VIA}" == "sdist" ]]; then python -m pytest `python -c "import pathlib; import spacy; print(pathlib.Path(spacy.__file__).parent.resolve())"`; fi
notifications: notifications:
slack: slack:

152
CONTRIBUTING.md Normal file

@ -0,0 +1,152 @@
<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
# Contribute to spaCy
Following the v1.0 release, it's time to welcome more contributors into the spaCy project and code base 🎉 This page will give you a quick overview of how things are organised and, most importantly, how to get involved.
## Table of contents
1. [Issues and bug reports](#issues-and-bug-reports)
2. [Contributing to the code base](#contributing-to-the-code-base)
3. [Updating the website](#updating-the-website)
4. [Submitting a tutorial](#submitting-a-tutorial)
5. [Submitting a project to the showcase](#submitting-a-project-to-the-showcase)
6. [Code of conduct](#code-of-conduct)
## Issues and bug reports
First, [do a quick search](https://github.com/issues?q=+is%3Aissue+user%3Aexplosion) to see if the issue has already been reported. If so, it's often better to just leave a comment on an existing issue, rather than creating a new one.
If you're looking for help with your code, consider posting a question on [StackOverflow](http://stackoverflow.com/questions/tagged/spacy) instead. If you tag it `spacy` and `python`, more people will see it and hopefully be able to help.
When opening an issue, use a descriptive title and include your environment (operating system, Python version, spaCy version). Our [issue template](https://github.com/explosion/spaCy/issues/new) helps you remember the most important details to include.
If you've discovered a bug, you can also submit a [regression test](#fixing-bugs) straight away. When you're opening an issue to report the bug, simply refer to your pull request in the issue body.
### Issue labels
We use the following system to tag our issues:
| Issue label | Description |
| --- | --- |
| [`bug`](https://github.com/explosion/spaCy/labels/bug) | Bugs and behaviour differing from documentation |
| [`enhancement`](https://github.com/explosion/spaCy/labels/enhancement) | Feature requests and improvements |
| [`install`](https://github.com/explosion/spaCy/labels/install) | Installation problems |
| [`performance`](https://github.com/explosion/spaCy/labels/performance) | Accuracy, speed and memory use problems |
| [`tests`](https://github.com/explosion/spaCy/labels/tests) | Missing or incorrect [tests](spacy/tests) |
| [`linux`](https://github.com/explosion/spaCy/labels/linux), [`osx`](https://github.com/explosion/spaCy/labels/osx), [`windows`](https://github.com/explosion/spaCy/labels/windows) | Issues related to the specific operating systems |
| [`pip`](https://github.com/explosion/spaCy/labels/pip), [`conda`](https://github.com/explosion/spaCy/labels/conda) | Issues related to the specific package managers |
| [`duplicate`](https://github.com/explosion/spaCy/labels/duplicate) | Duplicates, i.e. issues that have been reported before |
| [`help wanted`](https://github.com/explosion/spaCy/labels/help%20wanted) | Requests for contributions |
## Contributing to the code base
Coming soon.
### Conventions for Python
Coming soon.
### Conventions for Cython
Coming soon.
### Developer resources
The [spaCy developer resources](https://github.com/explosion/spacy-dev-resources) repo contains useful scripts, tools and templates for developing spaCy, adding new languages and training new models. If you've written a script that might help others, feel free to contribute it to that repository.
### Contributor agreement
If you've made a substantial contribution to spaCy, you should fill in the [spaCy contributor agreement](.github/CONTRIBUTOR_AGREEMENT.md) to ensure that your contribution can be used across the project. If you agree to be bound by the terms of the agreement, fill in the [template](.github/CONTRIBUTOR_AGREEMENT.md) and include it with your pull request, or submit it separately to [`.github/contributors/`](/.github/contributors). The name of the file should be your GitHub username, with the extension `.md`. For example, the user example_user would create the file `.github/contributors/example_user.md`.
### Fixing bugs
When fixing a bug, first create an [issue](https://github.com/explosion/spaCy/issues) if one does not already exist. The description text can be very short; we don't want to make this too bureaucratic. Next, create a test file named `test_issue[ISSUE NUMBER].py` in the [`spacy/tests/regression`](spacy/tests/regression) folder.
Test for the bug you're fixing, and make sure the test fails. If the test requires the models to be loaded, mark it with the `pytest.mark.models` decorator.
Next, add and commit your test file referencing the issue number in the commit message. Finally, fix the bug, make sure your test passes and reference the issue in your commit message.
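The steps above could look roughly like the following. This is a minimal sketch, not an actual spaCy test: the issue number `0000` is a placeholder, and `normalize_whitespace` is a hypothetical stand-in for whatever spaCy behaviour the issue describes.

```python
# Hypothetical file: spacy/tests/regression/test_issue0000.py
# "0000" is a placeholder; use the number of the issue you opened.


def normalize_whitespace(text):
    """Stand-in for the behaviour under test (an assumption for illustration).

    A real regression test would call the spaCy code the issue is about.
    """
    return " ".join(text.split())


def test_issue0000():
    # Use the smallest input that reproduces the reported behaviour.
    assert normalize_whitespace("hello \n world") == "hello world"
```

If the test needs the models, it would additionally `import pytest` and carry the `@pytest.mark.models` decorator mentioned above. Running `python -m pytest spacy/tests/regression` before and after the fix is one way to confirm that the test first fails and then passes.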
## Updating the website
Our [website and docs](https://spacy.io) are implemented in [Jade/Pug](https://www.jade-lang.org), and built or served by [Harp](https://harpjs.com). Jade/Pug is an extensible templating language with a readable syntax that compiles to HTML. Here's how to view the site locally:
```bash
sudo npm install --global harp
git clone https://github.com/explosion/spaCy
cd spaCy/website
harp server
```
The docs can always use another example or more detail, and they should always be up to date and not misleading. To quickly find the correct file to edit, simply click on the "Suggest edits" button at the bottom of a page.
To make it easy to add content components, we use a [collection of custom mixins](_includes/_mixins.jade), like `+table`, `+list` or `+code`. For more info and troubleshooting guides, check out the [website README](website).
### Resources to get you started
* [Guide to static websites with Harp and Jade](https://ines.io/blog/the-ultimate-guide-static-websites-harp-jade) (ines.io)
* [Building a website with modular markup components (mixins)](https://explosion.ai/blog/modular-markup) (explosion.ai)
* [Jade/Pug documentation](https://pugjs.org) (pugjs.org)
* [Harp documentation](https://harpjs.com/) (harpjs.com)
## Submitting a tutorial
Did you write a [tutorial](https://spacy.io/docs/usage/tutorials) to help others use spaCy, or did you come across one that should be added to our directory? You can submit it by making a pull request to [`website/docs/usage/_data.json`](website/docs/usage/_data.json):
```json
{
"tutorials": {
"deep_dives": {
"Deep Learning with custom pipelines and Keras": {
"url": "https://explosion.ai/blog/spacy-deep-learning-keras",
"author": "Matthew Honnibal",
"tags": [ "keras", "sentiment" ]
}
}
}
}
```
### A few tips
* A suitable tutorial should provide additional content and practical examples that are not covered as such in the docs.
* Make sure to choose the right category: `first_steps`, `deep_dives` (tutorials that take a deeper look at specific features) or `code` (programs and scripts on GitHub etc.).
* Don't go overboard with the tags. Take inspiration from the existing ones and only add tags for features (`"sentiment"`, `"pos"`) or integrations (`"jupyter"`, `"keras"`).
* Double-check the JSON markup and/or use a linter. A wrong or missing comma will (unfortunately) break the site rendering.
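If you don't have a JSON linter at hand, a few lines of Python can catch the most common mistakes before you open a pull request. The sketch below is only an illustration: `check_tutorials` is a hypothetical helper, not part of spaCy or the website tooling, and the set of required keys is an assumption based on the example entry above.

```python
import json

# Assumption: every tutorial entry needs the keys used in the example above.
REQUIRED_KEYS = {"url", "author", "tags"}


def check_tutorials(raw_json):
    """Return a list of problems found in a tutorials snippet (empty if none)."""
    try:
        data = json.loads(raw_json)
    except ValueError as err:  # raised for syntax errors such as a stray comma
        return ["invalid JSON: {}".format(err)]
    problems = []
    for category, entries in data.get("tutorials", {}).items():
        for title, entry in entries.items():
            missing = REQUIRED_KEYS - set(entry)
            if missing:
                problems.append(
                    "{} / {}: missing {}".format(category, title, sorted(missing))
                )
    return problems


example = """
{
    "tutorials": {
        "deep_dives": {
            "Deep Learning with custom pipelines and Keras": {
                "url": "https://explosion.ai/blog/spacy-deep-learning-keras",
                "author": "Matthew Honnibal",
                "tags": [ "keras", "sentiment" ]
            }
        }
    }
}
"""

print(check_tutorials(example) or "no problems found")
```

An empty list means the snippet parsed and every entry has the assumed keys; it does not guarantee that the site will render the entry as intended.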
## Submitting a project to the showcase
Have you built a library, visualizer, demo or product with spaCy, or did you come across one that should be featured in our [showcase](https://spacy.io/docs/usage/showcase)? You can submit it by making a pull request to [`website/docs/usage/_data.json`](website/docs/usage/_data.json):
```json
{
"showcase": {
"visualizations": {
"displaCy": {
"url": "https://demos.explosion.ai/displacy",
"author": "Ines Montani",
"description": "An open-source NLP visualiser for the modern web",
"image": "displacy.jpg"
}
}
}
}
```
### A few tips
* A suitable third-party library should add substantial functionality, be well-documented and open-source. If it's just a code snippet or script, consider submitting it to the `code` category of the tutorials section instead.
* A suitable demo should be hosted and accessible online. Open-source code is always a plus.
* For visualizations and products, add an image that clearly shows how it looks; screenshots are ideal.
* The image should be resized to 300x188px, optimised using a tool like [ImageOptim](https://imageoptim.com/mac) and added to [`website/assets/img/showcase`](website/assets/img/showcase).
* Double-check the JSON markup and/or use a linter. A wrong or missing comma will (unfortunately) break the site rendering.
## Code of conduct
spaCy adheres to the [Contributor Covenant Code of Conduct](http://contributor-covenant.org/version/1/4/). By participating, you are expected to uphold this code.

36
CONTRIBUTORS.md Normal file

@ -0,0 +1,36 @@
# 👥 Contributors
This is a list of everyone who has made significant contributions to spaCy, in alphabetical order. Thanks a lot for the great work!
* Adam Bittlingmayer, [@bittlingmayer](https://github.com/bittlingmayer)
* Andreas Grivas, [@andreasgrv](https://github.com/andreasgrv)
* Bhargav Srinivasa, [@bhargavvader](https://github.com/bhargavvader)
* Chris DuBois, [@chrisdubois](https://github.com/chrisdubois)
* Christoph Schwienheer, [@chssch](https://github.com/chssch)
* Dafne van Kuppevelt, [@dafnevk](https://github.com/dafnevk)
* Dmytro Sadovnychyi, [@sadovnychyi](https://github.com/sadovnychyi)
* György Orosz, [@oroszgy](https://github.com/oroszgy)
* Henning Peters, [@henningpeters](https://github.com/henningpeters)
* Ines Montani, [@ines](https://github.com/ines)
* J Nicolas Schrading, [@NSchrading](https://github.com/NSchrading)
* Janneke van der Zwaan, [@jvdzwaan](https://github.com/jvdzwaan)
* Jordan Suchow, [@suchow](https://github.com/suchow)
* Kendrick Tan, [@kendricktan](https://github.com/kendricktan)
* Kyle P. Johnson, [@kylepjohnson](https://github.com/kylepjohnson)
* Liling Tan, [@alvations](https://github.com/alvations)
* Magnus Burton, [@magnusburton](https://github.com/magnusburton)
* Mark Amery, [@ExplodingCabbage](https://github.com/ExplodingCabbage)
* Matthew Honnibal, [@honnibal](https://github.com/honnibal)
* Maxim Samsonov, [@maxirmx](https://github.com/maxirmx)
* Oleg Zd, [@olegzd](https://github.com/olegzd)
* Pokey Rule, [@pokey](https://github.com/pokey)
* Rob van Nieuwpoort, [@RvanNieuwpoort](https://github.com/RvanNieuwpoort)
* Sam Bozek, [@sambozek](https://github.com/sambozek)
* Sasho Savkov, [@savkov](https://github.com/savkov)
* Tiago Rodrigues, [@TiagoMRodrigues](https://github.com/TiagoMRodrigues)
* Vsevolod Solovyov, [@vsolovyov](https://github.com/vsolovyov)
* Wah Loon Keng, [@kengz](https://github.com/kengz)
* Willem van Hage, [@wrvhage](https://github.com/wrvhage)
* Wolfgang Seeker, [@wbwseeker](https://github.com/wbwseeker)
* Yanhao Yang, [@YanhaoYang](https://github.com/YanhaoYang)
* Yubing Dong, [@tomtung](https://github.com/tomtung)


@ -1,8 +1,6 @@
The MIT License (MIT) The MIT License (MIT)
Copyright (C) 2015 Matthew Honnibal Copyright (C) 2016 ExplosionAI UG (haftungsbeschränkt), 2016 spaCy GmbH, 2015 Matthew Honnibal
2016 spaCy GmbH
2016 ExplosionAI UG (haftungsbeschränkt)
Permission is hereby granted, free of charge, to any person obtaining a copy Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal of this software and associated documentation files (the "Software"), to deal


@ -2,52 +2,84 @@ spaCy: Industrial-strength NLP
****************************** ******************************
spaCy is a library for advanced natural language processing in Python and spaCy is a library for advanced natural language processing in Python and
Cython. `See here <https://spacy.io>`_ for documentation and details. spaCy is built on Cython. spaCy is built on the very latest research, but it isn't researchware.
the very latest research, but it isn't researchware. It was designed from day 1 It was designed from day 1 to be used in real products. It's commercial
to be used in real products. It's commercial open-source software, released under open-source software, released under the MIT license.
the MIT license.
💫 **Version 1.5 out now!** `Read the release notes here. <https://github.com/explosion/spaCy/releases/>`_
.. image:: http://i.imgur.com/wFvLZyJ.png .. image:: http://i.imgur.com/wFvLZyJ.png
:target: https://travis-ci.org/explosion/spaCy :target: https://travis-ci.org/explosion/spaCy
:alt: spaCy on Travis CI
.. image:: https://travis-ci.org/explosion/spaCy.svg?branch=master .. image:: https://travis-ci.org/explosion/spaCy.svg?branch=master
:target: https://travis-ci.org/explosion/spaCy :target: https://travis-ci.org/explosion/spaCy
:alt: Build Status
.. image:: https://img.shields.io/github/tag/explosion/spacy.svg .. image:: https://img.shields.io/github/release/explosion/spacy.svg
:target: https://github.com/explosion/spaCy/releases :target: https://github.com/explosion/spaCy/releases
:alt: Current Release Version
.. image:: https://img.shields.io/pypi/v/spacy.svg .. image:: https://img.shields.io/pypi/v/spacy.svg
:target: https://pypi.python.org/pypi/spacy :target: https://pypi.python.org/pypi/spacy
:alt: pypi Version
Where to ask questions .. image:: https://badges.gitter.im/spaCy-users.png
====================== :target: https://gitter.im/explosion/spaCy
:alt: spaCy on Gitter
📖 Documentation
================
+--------------------------------------------------------------------------------+---------------------------------------------------------+
| `Usage Workflows <https://spacy.io/docs/usage/>`_   | How to use spaCy and its features.              |
+--------------------------------------------------------------------------------+---------------------------------------------------------+
| `API Reference <https://spacy.io/docs/api/>`_   | The detailed reference for spaCy's API. |
+--------------------------------------------------------------------------------+---------------------------------------------------------+
| `Tutorials <https://spacy.io/docs/usage/tutorials>`_ | End-to-end examples, with code you can modify and run. |
+--------------------------------------------------------------------------------+---------------------------------------------------------+
| `Showcase & Demos <https://spacy.io/docs/usage/showcase>`_ | Demos, libraries and products from the spaCy community. |
+--------------------------------------------------------------------------------+---------------------------------------------------------+
| `Contribute <https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md>`_ | How to contribute to the spaCy project and code base. |
+--------------------------------------------------------------------------------+---------------------------------------------------------+
💬 Where to ask questions
==========================
+---------------------------+------------------------------------------------------------------------------------------------------------+ +---------------------------+------------------------------------------------------------------------------------------------------------+
| 🔴 **Bug reports**     | `GitHub Issue tracker <https://github.com/explosion/spaCy/issues>`_                                     | | **Bug reports**     | `GitHub Issue tracker <https://github.com/explosion/spaCy/issues>`_                                     |
+---------------------------+------------------------------------------------------------------------------------------------------------+ +---------------------------+------------------------------------------------------------------------------------------------------------+
| ⁉️ **Usage questions**   | `StackOverflow <http://stackoverflow.com/questions/tagged/spacy>`_, `Reddit usergroup                     | | **Usage questions**   | `StackOverflow <http://stackoverflow.com/questions/tagged/spacy>`_, `Reddit usergroup                     |
| | <https://www.reddit.com/r/spacynlp>`_, `Gitter chat <https://gitter.im/spaCy-users>`_ | | | <https://www.reddit.com/r/spacynlp>`_, `Gitter chat <https://gitter.im/explosion/spaCy>`_ |
+---------------------------+------------------------------------------------------------------------------------------------------------+ +---------------------------+------------------------------------------------------------------------------------------------------------+
| 💬 **General discussion** |  `Reddit usergroup <https://www.reddit.com/r/spacynlp>`_, `Gitter chat <https://gitter.im/spaCy-users>`_  | | **General discussion** |  `Reddit usergroup <https://www.reddit.com/r/spacynlp>`_, |
| | `Gitter chat <https://gitter.im/explosion/spaCy>`_  |
+---------------------------+------------------------------------------------------------------------------------------------------------+ +---------------------------+------------------------------------------------------------------------------------------------------------+
| 💥 **Commercial support** | contact@explosion.ai                                                                                     | | **Commercial support** | contact@explosion.ai                                                                                     |
+---------------------------+------------------------------------------------------------------------------------------------------------+ +---------------------------+------------------------------------------------------------------------------------------------------------+
Features Features
======== ========
* Labelled dependency parsing (91.8% accuracy on OntoNotes 5) * Non-destructive **tokenization**
* Named entity recognition (82.6% accuracy on OntoNotes 5) * Syntax-driven sentence segmentation
* Part-of-speech tagging (97.1% accuracy on OntoNotes 5) * Pre-trained **word vectors**
* Easy to use word vectors * Part-of-speech tagging
* All strings mapped to integer IDs * **Named entity** recognition
* Labelled dependency parsing
* Convenient string-to-int mapping
* Export to numpy data arrays * Export to numpy data arrays
* Alignment maintained to original string, ensuring easy mark up calculation * GIL-free **multi-threading**
* Range of easy-to-use orthographic features. * Efficient binary serialization
* No pre-processing required. spaCy takes raw text as input, warts and newlines and all. * Easy **deep learning** integration
* Statistical models for **English** and **German**
* State-of-the-art speed
* Robust, rigorously evaluated accuracy
Top Peformance See `facts, figures and benchmarks <https://spacy.io/docs/api/>`_.
==============
Top Performance
===============
* Fastest in the world: <50ms per document. No faster system has ever been * Fastest in the world: <50ms per document. No faster system has ever been
announced. announced.
@ -59,7 +91,7 @@ Supports
======== ========
* CPython 2.6, 2.7, 3.3, 3.4, 3.5 (only 64 bit) * CPython 2.6, 2.7, 3.3, 3.4, 3.5 (only 64 bit)
* OSX * macOS / OS X
* Linux * Linux
* Windows (Cygwin, MinGW, Visual Studio) * Windows (Cygwin, MinGW, Visual Studio)
@ -67,20 +99,11 @@ Install spaCy
============= =============
spaCy is compatible with 64-bit CPython 2.6+/3.3+ and runs on Unix/Linux, OS X spaCy is compatible with 64-bit CPython 2.6+/3.3+ and runs on Unix/Linux, OS X
and Windows. Source and binary packages are available via and Windows. Source packages are available via
`pip <https://pypi.python.org/pypi/spacy>`_ and `conda <https://anaconda.org/spacy/spacy>`_. `pip <https://pypi.python.org/pypi/spacy>`_. Please make sure that
If there are no binary packages for your platform available please make sure that you have a working build environment set up. See notes on Ubuntu, macOS/OS X and Windows
you have a working build environment set up. See notes on Ubuntu, OS X and Windows
for details. for details.
conda
-----
.. code:: bash
conda config --add channels spacy # only needed once
conda install spacy
pip pip
--- ---
@ -89,12 +112,6 @@ avoid modifying system state:
.. code:: bash .. code:: bash
# make sure you are using a recent pip/virtualenv version
python -m pip install -U pip virtualenv
virtualenv .env
source .env/bin/activate
pip install spacy pip install spacy
Python packaging is awkward at the best of times, and it's particularly tricky with Python packaging is awkward at the best of times, and it's particularly tricky with
@ -109,17 +126,10 @@ English and German, named ``en`` and ``de``, are available.
.. code:: bash .. code:: bash
python -m spacy.en.download python -m spacy.en.download all
python -m spacy.de.download python -m spacy.de.download all
sputnik --name spacy en_glove_cc_300_1m_vectors # For better word vectors
Then check whether the model was successfully installed: The download command fetches about 1 GB of data which it installs
.. code:: bash
python -c "import spacy; spacy.load('en'); print('OK')"
The download command fetches and installs about 500 MB of data which it installs
within the ``spacy`` package directory. within the ``spacy`` package directory.
Upgrading spaCy Upgrading spaCy
@ -127,13 +137,6 @@ Upgrading spaCy
To upgrade spaCy to the latest release: To upgrade spaCy to the latest release:
conda
-----
.. code:: bash
conda update spacy
pip pip
--- ---
@ -165,14 +168,14 @@ system. See notes on Ubuntu, OS X and Windows for details.
python -m pip install -U pip virtualenv python -m pip install -U pip virtualenv
# find git install instructions at https://git-scm.com/downloads # find git install instructions at https://git-scm.com/downloads
git clone https://github.com/spacy-io/spaCy.git git clone https://github.com/explosion/spaCy.git
cd spaCy cd spaCy
virtualenv .env && source .env/bin/activate virtualenv .env && source .env/bin/activate
pip install -r requirements.txt pip install -r requirements.txt
pip install -e . pip install -e .
Compared to regular install via pip and conda `requirements.txt <requirements.txt>`_ Compared to regular install via pip `requirements.txt <requirements.txt>`_
additionally installs developer dependencies such as cython. additionally installs developer dependencies such as cython.
Ubuntu Ubuntu
@ -184,29 +187,20 @@ Install system-level dependencies via ``apt-get``:
sudo apt-get install build-essential python-dev git sudo apt-get install build-essential python-dev git
OS X macOS / OS X
---- ------------
Install a recent version of XCode, including the so-called "Command Line Tools". Install a recent version of `XCode <https://developer.apple.com/xcode/>`_,
OS X ships with Python and git preinstalled. including the so-called "Command Line Tools". macOS and OS X ship with Python
and git preinstalled.
Windows Windows
------- -------
Install a version of Visual Studio Express or higher that matches the version Install a version of `Visual Studio Express <https://www.visualstudio.com/vs/visual-studio-express/>`_
that was used to compile your Python interpreter. For official distributions or higher that matches the version that was used to compile your Python
these are VS 2008 (Python 2.7), VS 2010 (Python 3.4) and VS 2015 (Python 3.5). interpreter. For official distributions these are VS 2008 (Python 2.7),
VS 2010 (Python 3.4) and VS 2015 (Python 3.5).
Workaround for obsolete system Python
=====================================
If you're stuck using a system with an old version of Python, and you don't
have root access, we've prepared a bootstrap script to help you compile a local
Python install. Run:
.. code:: bash
curl https://raw.githubusercontent.com/spacy-io/gist/master/bootstrap_python_env.sh | bash && source .env/bin/activate
Run tests Run tests
========= =========
@ -228,22 +222,194 @@ and ``--model`` are optional and enable additional tests:
python -m pytest <spacy-directory> --vectors --model --slow python -m pytest <spacy-directory> --vectors --model --slow
API Documentation and Usage Examples Download model to custom location
==================================== =================================
For the detailed documentation, check out the `spaCy website <https://spacy.io/docs/>`_. You can specify where ``spacy.en.download`` and ``spacy.de.download`` download the language model
to using the ``--data-path`` or ``-d`` argument:
* `Usage Examples <https://spacy.io/docs/#examples>`_ .. code:: bash
* `API <https://spacy.io/docs/#api>`_
* `Annotation Specification <https://spacy.io/docs/#annotation>`_
* `Tutorials <https://spacy.io/docs/#tutorials>`_
python -m spacy.en.download all --data-path /some/dir
If you choose to download to a custom location, you will need to tell spaCy where to load the model
from in order to use it. You can do this either by calling ``spacy.util.set_data_path()`` before
calling ``spacy.load()``, or by passing a ``path`` argument to the ``spacy.en.English`` or
``spacy.de.German`` constructors.
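A minimal sketch of the first option, assuming spaCy 1.x and an English model already downloaded to the custom directory. The helper name ``load_english_from`` and the ``/some/dir`` path are illustrative only; ``spacy.util.set_data_path()`` and ``spacy.load()`` are the calls named above.

```python
def load_english_from(data_dir):
    """Load the English model from a custom data directory (spaCy 1.x sketch)."""
    # Imported lazily so this snippet can be imported without spaCy installed.
    import spacy
    import spacy.util

    # Point spaCy at the directory that was passed to --data-path on download.
    spacy.util.set_data_path(data_dir)
    return spacy.load('en')
```

Calling ``load_english_from('/some/dir')`` is then meant to stand in for a plain ``spacy.load('en')`` on a default install.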
Changelog Changelog
========= =========
2016-05-10 `v0.101.0 <../../releases/tag/0.101.0>`_: *Fixed German model* 2016-12-27 `v1.5.0 <https://github.com/explosion/spaCy/releases>`_: *Alpha support for Swedish and Hungarian*
------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------
**✨ Major features and improvements**
* **NEW:** Alpha support for Swedish tokenization.
* **NEW:** Alpha support for Hungarian tokenization.
* Update language data for Spanish tokenization.
* Speed up tokenization when no data is preloaded by caching the first 10,000 vocabulary items seen.
**🔴 Bug fixes**
* List the ``language_data`` package in the ``setup.py``.
* Fix missing ``vec_path`` declaration that was failing if ``add_vectors`` was set.
* Allow ``Vocab`` to load without ``serializer_freqs``.
**📖 Documentation and examples**
* **NEW:** `spaCy Jupyter notebooks <https://github.com/explosion/spacy-notebooks>`_ repo: ongoing collection of easy-to-run spaCy examples and tutorials.
* Fix issue `#657 <https://github.com/explosion/spaCy/issues/657>`_: Generalise dependency parsing `annotation specs <https://spacy.io/docs/api/annotation>`_ beyond English.
* Fix various typos and inconsistencies.
**👥 Contributors**
Thanks to `@oroszgy <https://github.com/oroszgy>`_, `@magnusburton <https://github.com/magnusburton>`_, `@jmizgajski <https://github.com/jmizgajski>`_, `@aikramer2 <https://github.com/aikramer2>`_, `@fnorf <https://github.com/fnorf>`_ and `@bhargavvader <https://github.com/bhargavvader>`_ for the pull requests!
2016-12-18 `v1.4.0 <https://github.com/explosion/spaCy/releases/tag/v1.4.0>`_: *Improved language data and alpha Dutch support*
-------------------------------------------------------------------------------------------------------------------------------
**✨ Major features and improvements**
* **NEW:** Alpha support for Dutch tokenization.
* Reorganise and improve format for language data.
* Add shared tag map, entity rules, emoticons and punctuation to language data.
* Convert entity rules, morphological rules and lemmatization rules from JSON to Python.
* Update language data for English, German, Spanish, French, Italian and Portuguese.
**🔴 Bug fixes**
* Fix issue `#649 <https://github.com/explosion/spaCy/issues/649>`_: Update and reorganise stop lists.
* Fix issue `#672 <https://github.com/explosion/spaCy/issues/672>`_: Make ``token.ent_iob_`` return unicode.
* Fix issue `#674 <https://github.com/explosion/spaCy/issues/674>`_: Add missing lemmas for contracted forms of "be" to ``TOKENIZER_EXCEPTIONS``.
* Fix issue `#683 <https://github.com/explosion/spaCy/issues/683>`_: ``Morphology`` class now supplies tag map value for the special space tag if it's missing.
* Fix issue `#684 <https://github.com/explosion/spaCy/issues/684>`_: Ensure ``spacy.en.English()`` loads the Glove vector data if available. Previously was inconsistent with behaviour of ``spacy.load('en')``.
* Fix issue `#685 <https://github.com/explosion/spaCy/issues/685>`_: Expand ``TOKENIZER_EXCEPTIONS`` with unicode apostrophe (``’``).
* Fix issue `#689 <https://github.com/explosion/spaCy/issues/689>`_: Correct typo in ``STOP_WORDS``.
* Fix issue `#691 <https://github.com/explosion/spaCy/issues/691>`_: Add tokenizer exceptions for "gonna" and "Gonna".
**⚠️ Backwards incompatibilities**
No changes to the public, documented API, but the previously undocumented language data and model initialisation processes have been refactored and reorganised. If you were relying on the ``bin/init_model.py`` script, see the new `spaCy Developer Resources <https://github.com/explosion/spacy-dev-resources>`_ repo. Code that references internals of the ``spacy.en`` or ``spacy.de`` packages should also be reviewed before updating to this version.
**📖 Documentation and examples**
* **NEW:** `"Adding languages" <https://spacy.io/docs/usage/adding-languages>`_ workflow.
* **NEW:** `"Part-of-speech tagging" <https://spacy.io/docs/usage/pos-tagging>`_ workflow.
* **NEW:** `spaCy Developer Resources <https://github.com/explosion/spacy-dev-resources>`_ repo: scripts, tools and resources for developing spaCy.
* Fix various typos and inconsistencies.
**👥 Contributors**
Thanks to `@dafnevk <https://github.com/dafnevk>`_, `@jvdzwaan <https://github.com/jvdzwaan>`_, `@RvanNieuwpoort <https://github.com/RvanNieuwpoort>`_, `@wrvhage <https://github.com/wrvhage>`_, `@jaspb <https://github.com/jaspb>`_, `@savvopoulos <https://github.com/savvopoulos>`_ and `@davedwards <https://github.com/davedwards>`_ for the pull requests!
2016-12-03 `v1.3.0 <https://github.com/explosion/spaCy/releases/tag/v1.3.0>`_: *Improve API consistency*
--------------------------------------------------------------------------------------------------------
**✨ API improvements**
* Add ``Span.sentiment`` attribute.
* `#658 <https://github.com/explosion/spaCy/pull/658>`_: Add ``Span.noun_chunks`` iterator (thanks `@pokey <https://github.com/pokey>`_).
* `#642 <https://github.com/explosion/spaCy/pull/642>`_: Let ``--data-path`` be specified when running download.py scripts (thanks `@ExplodingCabbage <https://github.com/ExplodingCabbage>`_).
* `#638 <https://github.com/explosion/spaCy/pull/638>`_: Add German stopwords (thanks `@souravsingh <https://github.com/souravsingh>`_).
* `#614 <https://github.com/explosion/spaCy/pull/614>`_: Fix ``PhraseMatcher`` to work with new ``Matcher`` (thanks `@sadovnychyi <https://github.com/sadovnychyi>`_).
**🔴 Bug fixes**
* Fix issue `#605 <https://github.com/explosion/spaCy/issues/605>`_: ``accept`` argument to ``Matcher`` now rejects matches as expected.
* Fix issue `#617 <https://github.com/explosion/spaCy/issues/617>`_: ``Vocab.load()`` now works with string paths, as well as ``Path`` objects.
* Fix issue `#639 <https://github.com/explosion/spaCy/issues/639>`_: Stop words in ``Language`` class now used as expected.
* Fix issues `#656 <https://github.com/explosion/spaCy/issues/656>`_, `#624 <https://github.com/explosion/spaCy/issues/624>`_: ``Tokenizer`` special-case rules now support arbitrary token attributes.
**📖 Documentation and examples**
* Add `"Customizing the tokenizer" <https://spacy.io/docs/usage/customizing-tokenizer>`_ workflow.
* Add `"Training the tagger, parser and entity recognizer" <https://spacy.io/docs/usage/training>`_ workflow.
* Add `"Entity recognition" <https://spacy.io/docs/usage/entity-recognition>`_ workflow.
* Fix various typos and inconsistencies.
**👥 Contributors**
Thanks to `@pokey <https://github.com/pokey>`_, `@ExplodingCabbage <https://github.com/ExplodingCabbage>`_, `@souravsingh <https://github.com/souravsingh>`_, `@sadovnychyi <https://github.com/sadovnychyi>`_, `@manojsakhwar <https://github.com/manojsakhwar>`_, `@TiagoMRodrigues <https://github.com/TiagoMRodrigues>`_, `@savkov <https://github.com/savkov>`_, `@pspiegelhalter <https://github.com/pspiegelhalter>`_, `@chenb67 <https://github.com/chenb67>`_, `@kylepjohnson <https://github.com/kylepjohnson>`_, `@YanhaoYang <https://github.com/YanhaoYang>`_, `@tjrileywisc <https://github.com/tjrileywisc>`_, `@dechov <https://github.com/dechov>`_, `@wjt <https://github.com/wjt>`_, `@jsmootiv <https://github.com/jsmootiv>`_ and `@blarghmatey <https://github.com/blarghmatey>`_ for the pull requests!
2016-11-04 `v1.2.0 <https://github.com/explosion/spaCy/releases/tag/v1.2.0>`_: *Alpha tokenizers for Chinese, French, Spanish, Italian and Portuguese*
------------------------------------------------------------------------------------------------------------------------------------------------------
**✨ Major features and improvements**
* **NEW:** Support Chinese tokenization, via `Jieba <https://github.com/fxsjy/jieba>`_.
* **NEW:** Alpha support for French, Spanish, Italian and Portuguese tokenization.
**🔴 Bug fixes**
* Fix issue `#376 <https://github.com/explosion/spaCy/issues/376>`_: POS tags for "and/or" are now correct.
* Fix issue `#578 <https://github.com/explosion/spaCy/issues/578>`_: ``--force`` argument on download command now operates correctly.
* Fix issue `#595 <https://github.com/explosion/spaCy/issues/595>`_: Lemmatization corrected for some base forms.
* Fix issue `#588 <https://github.com/explosion/spaCy/issues/588>`_: ``Matcher`` now rejects empty patterns.
* Fix issue `#592 <https://github.com/explosion/spaCy/issues/592>`_: Added exception rule for tokenization of "Ph.D."
* Fix issue `#599 <https://github.com/explosion/spaCy/issues/599>`_: Empty documents now considered tagged and parsed.
* Fix issue `#600 <https://github.com/explosion/spaCy/issues/600>`_: Add missing ``token.tag`` and ``token.tag_`` setters.
* Fix issue `#596 <https://github.com/explosion/spaCy/issues/596>`_: Added missing unicode import when compiling regexes that led to incorrect tokenization.
* Fix issue `#587 <https://github.com/explosion/spaCy/issues/587>`_: Resolved bug that caused ``Matcher`` to sometimes segfault.
* Fix issue `#429 <https://github.com/explosion/spaCy/issues/429>`_: Ensure missing entity types are added to the entity recognizer.
2016-10-23 `v1.1.0 <https://github.com/explosion/spaCy/releases/tag/v1.1.0>`_: *Bug fixes and adjustments*
----------------------------------------------------------------------------------------------------------
* Rename new ``pipeline`` keyword argument of ``spacy.load()`` to ``create_pipeline``.
* Rename new ``vectors`` keyword argument of ``spacy.load()`` to ``add_vectors``.
**🔴 Bug fixes**
* Fix issue `#544 <https://github.com/explosion/spaCy/issues/544>`_: Add ``vocab.resize_vectors()`` method, to support changing to vectors of different dimensionality.
* Fix issue `#536 <https://github.com/explosion/spaCy/issues/536>`_: Default probability was incorrect for OOV words.
* Fix issue `#539 <https://github.com/explosion/spaCy/issues/539>`_: Unspecified encoding when opening some JSON files.
* Fix issue `#541 <https://github.com/explosion/spaCy/issues/541>`_: GloVe vectors were being loaded incorrectly.
* Fix issue `#522 <https://github.com/explosion/spaCy/issues/522>`_: Similarities and vector norms were calculated incorrectly.
* Fix issue `#461 <https://github.com/explosion/spaCy/issues/461>`_: ``ent_iob`` attribute was incorrect after setting entities via ``doc.ents``
* Fix issue `#459 <https://github.com/explosion/spaCy/issues/459>`_: Deserialiser failed on empty doc
* Fix issue `#514 <https://github.com/explosion/spaCy/issues/514>`_: Serialization failed after adding a new entity label.
2016-10-18 `v1.0.0 <https://github.com/explosion/spaCy/releases/tag/v1.0.0>`_: *Support for deep learning workflows and entity-aware rule matcher*
--------------------------------------------------------------------------------------------------------------------------------------------------
**✨ Major features and improvements**
* **NEW:** `custom processing pipelines <https://spacy.io/docs/usage/customizing-pipeline>`_, to support deep learning workflows
* **NEW:** `Rule matcher <https://spacy.io/docs/usage/rule-based-matching>`_ now supports entity IDs and attributes
* **NEW:** Official/documented `training APIs <https://github.com/explosion/spaCy/tree/master/examples/training>`_ and ``GoldParse`` class
* Download and use GloVe vectors by default
* Make it easier to load and unload word vectors
* Improved rule matching functionality
* Move basic data into the code, rather than the json files. This makes it simpler to use the tokenizer without the models installed, and makes adding new languages much easier.
* Replace file-system strings with ``Path`` objects. You can now load resources over your network, or do similar trickery, by passing any object that supports the ``Path`` protocol.
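One way to picture the ``Path`` change is a loader that accepts either a plain string or any object with a ``Path``-style ``open()`` method. The sketch below is illustrative and is not spaCy's own code; ``ensure_path`` and ``read_text`` are hypothetical helper names, and it targets Python 3.

```python
from pathlib import Path


def ensure_path(location):
    """Return a Path for plain strings; pass Path-like objects through unchanged."""
    if isinstance(location, str):
        return Path(location)
    return location


def read_text(location, encoding="utf8"):
    """Read a text resource from anything that offers a Path-style ``open()``."""
    path = ensure_path(location)
    with path.open("r", encoding=encoding) as file_:
        return file_.read()
```

Because ``read_text`` only relies on ``open()``, a custom object that fetches its contents from elsewhere could be passed in place of a local path.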
**⚠️ Backwards incompatibilities**
* The data_dir keyword argument of ``Language.__init__`` (and its subclasses ``English.__init__`` and ``German.__init__``) has been renamed to ``path``.
* Details of how the Language base-class and its sub-classes are loaded, and how defaults are accessed, have been heavily changed. If you have your own subclasses, you should review the changes.
* The deprecated ``token.repvec`` name has been removed.
* The ``.train()`` method of Tagger and Parser has been renamed to ``.update()``
* The previously undocumented ``GoldParse`` class has a new ``__init__()`` method. The old method has been preserved in ``GoldParse.from_annot_tuples()``.
* Previously undocumented details of the ``Parser`` class have changed.
* The previously undocumented ``get_package`` and ``get_package_by_name`` helper functions have been moved into a new module, ``spacy.deprecated``, in case you still need them while you update.
**🔴 Bug fixes**
* Fix ``get_lang_class`` bug when GloVe vectors are used.
* Fix Issue `#411 <https://github.com/explosion/spaCy/issues/411>`_: ``doc.sents`` raised IndexError on empty string.
* Fix Issue `#455 <https://github.com/explosion/spaCy/issues/455>`_: Correct lemmatization logic
* Fix Issue `#371 <https://github.com/explosion/spaCy/issues/371>`_: Make ``Lexeme`` objects hashable
* Fix Issue `#469 <https://github.com/explosion/spaCy/issues/469>`_: Make ``noun_chunks`` detect root NPs
**👥 Contributors**
Thanks to `@daylen <https://github.com/daylen>`_, `@RahulKulhari <https://github.com/RahulKulhari>`_, `@stared <https://github.com/stared>`_, `@adamhadani <https://github.com/adamhadani>`_, `@izeye <https://github.com/izeye>`_ and `@crawfordcomeaux <https://github.com/crawfordcomeaux>`_ for the pull requests!
2016-05-10 `v0.101.0 <https://github.com/explosion/spaCy/releases/tag/0.101.0>`_: *Fixed German model*
------------------------------------------------------------------------------------------------------
* Fixed bug that prevented German parses from being deprojectivised. * Fixed bug that prevented German parses from being deprojectivised.
* Bug fixes to sentence boundary detection. * Bug fixes to sentence boundary detection.
@ -251,8 +417,8 @@ Changelog
* Add missing ``Doc.has_vector`` and ``Span.has_vector`` properties. * Add missing ``Doc.has_vector`` and ``Span.has_vector`` properties.
* Add missing ``Span.sent`` property. * Add missing ``Span.sent`` property.
2016-05-05 `v0.100.7 <../../releases/tag/0.100.7>`_: *German!* 2016-05-05 `v0.100.7 <https://github.com/explosion/spaCy/releases/tag/0.100.7>`_: *German!*
-------------------------------------------------------------- -------------------------------------------------------------------------------------------
spaCy finally supports another language, in addition to English. We're lucky spaCy finally supports another language, in addition to English. We're lucky
to have Wolfgang Seeker on the team, and the new German model is just the to have Wolfgang Seeker on the team, and the new German model is just the
@ -296,13 +462,14 @@ and it doesn't yet recognise numeric entities such as numbers and dates.
* Fix bug that led to inconsistent sentence boundaries before and after serialisation. * Fix bug that led to inconsistent sentence boundaries before and after serialisation.
* Fix bug from deserialising untagged documents. * Fix bug from deserialising untagged documents.
2016-03-08 `v0.100.6 <../../releases/tag/0.100.6>`_: *Add support for GloVe vectors* 2016-03-08 `v0.100.6 <https://github.com/explosion/spaCy/releases/tag/0.100.6>`_: *Add support for GloVe vectors*
------------------------------------------------------------------------------------ -----------------------------------------------------------------------------------------------------------------
This release offers improved support for replacing the word vectors used by spaCy. This release offers improved support for replacing the word vectors used by spaCy.
To install Stanford's GloVe vectors, trained on the Common Crawl, just run: To install Stanford's GloVe vectors, trained on the Common Crawl, just run:
.. code:: bash .. code:: bash
sputnik --name spacy install en_glove_cc_300_1m_vectors sputnik --name spacy install en_glove_cc_300_1m_vectors
To reduce memory usage and loading time, we've trimmed the vocabulary down to 1m entries. To reduce memory usage and loading time, we've trimmed the vocabulary down to 1m entries.
@ -312,20 +479,21 @@ will be released shortly. To assist in multi-lingual processing, we've added a `
function. To load the English model with the GloVe vectors: function. To load the English model with the GloVe vectors:
.. code:: python .. code:: python
spacy.load('en', vectors='en_glove_cc_300_1m_vectors') spacy.load('en', vectors='en_glove_cc_300_1m_vectors')
2016-02-07 `v0.100.5 <../../releases/tag/0.100.5>`_ 2016-02-07 `v0.100.5 <https://github.com/explosion/spaCy/releases/tag/0.100.5>`_
--------------------------------------------------- --------------------------------------------------------------------------------
Fix incorrect use of header file, caused by a problem with thinc Fix incorrect use of header file, caused by a problem with thinc
2016-02-07 `v0.100.4 <../../releases/tag/0.100.4>`_: *Fix OSX problem introduced in 0.100.3* 2016-02-07 `v0.100.4 <https://github.com/explosion/spaCy/releases/tag/0.100.4>`_: *Fix OSX problem introduced in 0.100.3*
-------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------
Small correction to right_edge calculation Small correction to right_edge calculation
2016-02-06 `v0.100.3 <../../releases/tag/0.100.3>`_ 2016-02-06 `v0.100.3 <https://github.com/explosion/spaCy/releases/tag/0.100.3>`_
--------------------------------------------------- --------------------------------------------------------------------------------
Support multi-threading, via the ``.pipe`` method. spaCy now releases the GIL around the Support multi-threading, via the ``.pipe`` method. spaCy now releases the GIL around the
parser and entity recognizer, so systems that support OpenMP should be able to do parser and entity recognizer, so systems that support OpenMP should be able to do
@ -333,20 +501,20 @@ shared memory parallelism at close to full efficiency.
We've also greatly reduced loading time, and fixed a number of bugs. We've also greatly reduced loading time, and fixed a number of bugs.
2016-01-21 `v0.100.2 <../../releases/tag/0.100.2>`_ 2016-01-21 `v0.100.2 <https://github.com/explosion/spaCy/releases/tag/0.100.2>`_
--------------------------------------------------- --------------------------------------------------------------------------------
Fix data version lock that affected v0.100.1 Fix data version lock that affected v0.100.1
2016-01-21 `v0.100.1 <../../releases/tag/0.100.1>`_: *Fix install for OSX* 2016-01-21 `v0.100.1 <https://github.com/explosion/spaCy/releases/tag/0.100.1>`_: *Fix install for OSX*
-------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------
v0.100 included header files built on Linux that caused installation to fail on OSX. v0.100 included header files built on Linux that caused installation to fail on OSX.
This should now be corrected. We also update the default data distribution, to This should now be corrected. We also update the default data distribution, to
include a small fix to the tokenizer. include a small fix to the tokenizer.
2016-01-19 `v0.100 <../../releases/tag/0.100>`_: *Revise setup.py, better model downloads, bug fixes* 2016-01-19 `v0.100 <https://github.com/explosion/spaCy/releases/tag/0.100>`_: *Revise setup.py, better model downloads, bug fixes*
----------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------
* Redo setup.py, and remove ugly headers_workaround hack. Should result in fewer install problems. * Redo setup.py, and remove ugly headers_workaround hack. Should result in fewer install problems.
* Update data downloading and installation functionality, by migrating to the Sputnik data-package manager. This will allow us to offer finer grained control of data installation in future. * Update data downloading and installation functionality, by migrating to the Sputnik data-package manager. This will allow us to offer finer grained control of data installation in future.
@ -357,16 +525,16 @@ include a small fix to the tokenizer.
* Fix problem that caused ``doc.merge()`` to sometimes hang * Fix problem that caused ``doc.merge()`` to sometimes hang
* Fix problems in handling of whitespace * Fix problems in handling of whitespace
2015-11-08 `v0.99 <../../releases/tag/0.99>`_: *Improve span merging, internal refactoring* 2015-11-08 `v0.99 <https://github.com/explosion/spaCy/releases/tag/0.99>`_: *Improve span merging, internal refactoring*
------------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------
* Merging multi-word tokens into one, via the ``doc.merge()`` and ``span.merge()`` methods, no longer invalidates existing ``Span`` objects. This makes it much easier to merge multiple spans, e.g. to merge all named entities, or all base noun phrases. Thanks to @andreasgrv for help on this patch. * Merging multi-word tokens into one, via the ``doc.merge()`` and ``span.merge()`` methods, no longer invalidates existing ``Span`` objects. This makes it much easier to merge multiple spans, e.g. to merge all named entities, or all base noun phrases. Thanks to @andreasgrv for help on this patch.
* Lots of internal refactoring, especially around the machine learning module, thinc. The thinc API has now been improved, and the spacy._ml wrapper module is no longer necessary. * Lots of internal refactoring, especially around the machine learning module, thinc. The thinc API has now been improved, and the spacy._ml wrapper module is no longer necessary.
* The lemmatizer now lower-cases non-noun, non-verb and non-adjective words. * The lemmatizer now lower-cases non-noun, non-verb and non-adjective words.
* A new attribute, ``.rank``, is added to Token and Lexeme objects, giving the frequency rank of the word. * A new attribute, ``.rank``, is added to Token and Lexeme objects, giving the frequency rank of the word.
2015-11-03 `v0.98 <../../releases/tag/0.98>`_: *Smaller package, bug fixes* 2015-11-03 `v0.98 <https://github.com/explosion/spaCy/releases/tag/0.98>`_: *Smaller package, bug fixes*
--------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------
* Remove binary data from PyPi package. * Remove binary data from PyPi package.
* Delete archive after downloading data * Delete archive after downloading data
@ -374,36 +542,36 @@ include a small fix to the tokenizer.
* Fix information loss in deserialize * Fix information loss in deserialize
* Fix ``__str__`` methods for Python2 * Fix ``__str__`` methods for Python2
2015-10-23 `v0.97 <../../releases/tag/0.97>`_: *Load the StringStore from a json list, instead of a text file* 2015-10-23 `v0.97 <https://github.com/explosion/spaCy/releases/tag/0.97>`_: *Load the StringStore from a json list, instead of a text file*
-------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------------------
* Fix bugs in download.py * Fix bugs in download.py
* Require ``--force`` to over-write the data directory in download.py * Require ``--force`` to over-write the data directory in download.py
* Fix bugs in ``Matcher`` and ``doc.merge()`` * Fix bugs in ``Matcher`` and ``doc.merge()``
2015-10-19 `v0.96 <../../releases/tag/0.96>`_: *Hotfix to .merge method* 2015-10-19 `v0.96 <https://github.com/explosion/spaCy/releases/tag/0.96>`_: *Hotfix to .merge method*
------------------------------------------------------------------------ -----------------------------------------------------------------------------------------------------
* Fix bug that caused text to be lost after ``.merge`` * Fix bug that caused text to be lost after ``.merge``
* Fix bug in Matcher when matched entities overlapped * Fix bug in Matcher when matched entities overlapped
2015-10-18 `v0.95 <../../releases/tag/0.95>`_: *Bugfixes* 2015-10-18 `v0.95 <https://github.com/explosion/spaCy/releases/tag/0.95>`_: *Bugfixes*
--------------------------------------------------------- --------------------------------------------------------------------------------------
* Reform encoding of symbols * Reform encoding of symbols
* Fix bugs in ``Matcher`` * Fix bugs in ``Matcher``
* Fix bugs in ``Span`` * Fix bugs in ``Span``
* Add tokenizer rule to fix numeric range tokenization * Add tokenizer rule to fix numeric range tokenization
* Add specific string-length cap in Tokenizer * Add specific string-length cap in Tokenizer
* Fix ``token.conjuncts``` * Fix ``token.conjuncts``
2015-10-09 `v0.94 <../../releases/tag/0.94>`_ 2015-10-09 `v0.94 <https://github.com/explosion/spaCy/releases/tag/0.94>`_
--------------------------------------------- --------------------------------------------------------------------------
* Fix memory error that caused crashes on 32bit platforms * Fix memory error that caused crashes on 32bit platforms
* Fix parse errors caused by smart quotes and em-dashes * Fix parse errors caused by smart quotes and em-dashes
2015-09-22 `v0.93 <../../releases/tag/0.93>`_ 2015-09-22 `v0.93 <https://github.com/explosion/spaCy/releases/tag/0.93>`_
--------------------------------------------- --------------------------------------------------------------------------
Bug fixes to word vectors Bug fixes to word vectors


@ -1,229 +0,0 @@
"""Set up a model directory.

Requires:

    lang_data --- Rules for the tokenizer
        * prefix.txt
        * suffix.txt
        * infix.txt
        * morphs.json
        * specials.json

    corpora --- Data files
        * WordNet
        * words.sgt.prob --- Smoothed unigram probabilities
        * clusters.txt --- Output of hierarchical clustering, e.g. Brown clusters
        * vectors.bz2 --- output of something like word2vec, compressed with bzip
"""
from __future__ import unicode_literals
from ast import literal_eval
import math
import gzip
import json
import plac
from pathlib import Path
from shutil import copyfile
from shutil import copytree
from collections import defaultdict
import io
from spacy.vocab import Vocab
from spacy.vocab import write_binary_vectors
from spacy.strings import hash_string
from preshed.counter import PreshCounter
from spacy.parts_of_speech import NOUN, VERB, ADJ
from spacy.util import get_lang_class
try:
    unicode
except NameError:
    unicode = str
def setup_tokenizer(lang_data_dir, tok_dir):
    if not tok_dir.exists():
        tok_dir.mkdir()

    for filename in ('infix.txt', 'morphs.json', 'prefix.txt', 'specials.json',
                     'suffix.txt'):
        src = lang_data_dir / filename
        dst = tok_dir / filename
        copyfile(str(src), str(dst))
def _read_clusters(loc):
    if not loc.exists():
        print("Warning: Clusters file not found")
        return {}
    clusters = {}
    for line in io.open(str(loc), 'r', encoding='utf8'):
        try:
            cluster, word, freq = line.split()
        except ValueError:
            continue
        # If the clusterer has only seen the word a few times, its cluster is
        # unreliable.
        if int(freq) >= 3:
            clusters[word] = cluster
        else:
            clusters[word] = '0'
    # Expand clusters with re-casing
    for word, cluster in list(clusters.items()):
        if word.lower() not in clusters:
            clusters[word.lower()] = cluster
        if word.title() not in clusters:
            clusters[word.title()] = cluster
        if word.upper() not in clusters:
            clusters[word.upper()] = cluster
    return clusters
def _read_probs(loc):
    if not loc.exists():
        print("Probabilities file not found. Trying freqs.")
        return {}, 0.0
    probs = {}
    for i, line in enumerate(io.open(str(loc), 'r', encoding='utf8')):
        prob, word = line.split()
        prob = float(prob)
        probs[word] = prob
    return probs, probs['-OOV-']
def _read_freqs(loc, max_length=100, min_doc_freq=5, min_freq=200):
if not loc.exists():
print("Warning: Frequencies file not found")
return {}, 0.0
counts = PreshCounter()
total = 0
if str(loc).endswith('gz'):
file_ = gzip.open(str(loc))
else:
file_ = loc.open()
for i, line in enumerate(file_):
freq, doc_freq, key = line.rstrip().split('\t', 2)
freq = int(freq)
counts.inc(i+1, freq)
total += freq
counts.smooth()
log_total = math.log(total)
if str(loc).endswith('gz'):
file_ = gzip.open(str(loc))
else:
file_ = loc.open()
probs = {}
for line in file_:
freq, doc_freq, key = line.rstrip().split('\t', 2)
doc_freq = int(doc_freq)
freq = int(freq)
if doc_freq >= min_doc_freq and freq >= min_freq and len(key) < max_length:
word = literal_eval(key)
smooth_count = counts.smoother(int(freq))
probs[word] = math.log(smooth_count) - log_total
oov_prob = math.log(counts.smoother(0)) - log_total
return probs, oov_prob
def _read_senses(loc):
lexicon = defaultdict(lambda: defaultdict(list))
if not loc.exists():
print("Warning: WordNet senses not found")
return lexicon
sense_names = dict((s, i) for i, s in enumerate(spacy.senses.STRINGS))
pos_ids = {'noun': NOUN, 'verb': VERB, 'adjective': ADJ}
for line in io.open(str(loc), 'r', encoding='utf8'):
sense_strings = line.split()
word = sense_strings.pop(0)
for sense in sense_strings:
pos, sense = sense[3:].split('.')
sense_name = '%s_%s' % (pos[0].upper(), sense.lower())
if sense_name != 'N_tops':
sense_id = sense_names[sense_name]
lexicon[word][pos_ids[pos]].append(sense_id)
return lexicon
def setup_vocab(lex_attr_getters, tag_map, src_dir, dst_dir):
if not dst_dir.exists():
dst_dir.mkdir()
vectors_src = src_dir / 'vectors.bz2'
if vectors_src.exists():
write_binary_vectors(vectors_src.as_posix(), (dst_dir / 'vec.bin').as_posix())
else:
print("Warning: Word vectors file not found")
vocab = Vocab(lex_attr_getters=lex_attr_getters, tag_map=tag_map)
clusters = _read_clusters(src_dir / 'clusters.txt')
probs, oov_prob = _read_probs(src_dir / 'words.sgt.prob')
if not probs:
probs, oov_prob = _read_freqs(src_dir / 'freqs.txt.gz')
if not probs:
oov_prob = -20
else:
oov_prob = min(probs.values())
for word in clusters:
if word not in probs:
probs[word] = oov_prob
lexicon = []
for word, prob in reversed(sorted(list(probs.items()), key=lambda item: item[1])):
# First encode the strings into the StringStore. This way, we can map
# the orth IDs to frequency ranks
orth = vocab.strings[word]
# Now actually load the vocab
for word, prob in reversed(sorted(list(probs.items()), key=lambda item: item[1])):
lexeme = vocab[word]
lexeme.prob = prob
lexeme.is_oov = False
# Decode as a little-endian string, so that we can do & 15 to get
# the first 4 bits. See _parse_features.pyx
if word in clusters:
lexeme.cluster = int(clusters[word][::-1], 2)
else:
lexeme.cluster = 0
vocab.dump((dst_dir / 'lexemes.bin').as_posix())
with (dst_dir / 'strings.json').open('w') as file_:
vocab.strings.dump(file_)
with (dst_dir / 'oov_prob').open('w') as file_:
file_.write('%f' % oov_prob)
def main(lang_id, lang_data_dir, corpora_dir, model_dir):
model_dir = Path(model_dir)
lang_data_dir = Path(lang_data_dir) / lang_id
corpora_dir = Path(corpora_dir) / lang_id
assert corpora_dir.exists()
assert lang_data_dir.exists()
if not model_dir.exists():
model_dir.mkdir()
tag_map = json.load((lang_data_dir / 'tag_map.json').open())
setup_tokenizer(lang_data_dir, model_dir / 'tokenizer')
setup_vocab(get_lang_class(lang_id).Defaults.lex_attr_getters, tag_map, corpora_dir,
model_dir / 'vocab')
if (lang_data_dir / 'gazetteer.json').exists():
copyfile((lang_data_dir / 'gazetteer.json').as_posix(),
(model_dir / 'vocab' / 'gazetteer.json').as_posix())
copyfile((lang_data_dir / 'tag_map.json').as_posix(),
(model_dir / 'vocab' / 'tag_map.json').as_posix())
if (lang_data_dir / 'lemma_rules.json').exists():
copyfile((lang_data_dir / 'lemma_rules.json').as_posix(),
(model_dir / 'vocab' / 'lemma_rules.json').as_posix())
if not (model_dir / 'wordnet').exists() and (corpora_dir / 'wordnet').exists():
copytree((corpora_dir / 'wordnet' / 'dict').as_posix(),
(model_dir / 'wordnet').as_posix())
if __name__ == '__main__':
plac.call(main)
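
The cluster handling in setup_vocab reverses each cluster's bit string before parsing it as a base-2 integer, so that `& 15` recovers the first four bits. Below is a minimal sketch of that encoding, assuming Brown-style bit strings such as '110100111'. The helper names encode_cluster and first_bits are illustrative only and are not part of spaCy.

```python
def encode_cluster(bit_string):
    """Reverse the bit string and parse it as base 2, as setup_vocab does."""
    return int(bit_string[::-1], 2)


def first_bits(cluster_id, n_bits=4):
    """Recover the first n_bits characters of the original bit string.

    Because the string was reversed before parsing, its leading characters
    end up in the low-order bits, so a simple mask extracts them.
    Strings shorter than n_bits come back padded with trailing zeros.
    """
    mask = (1 << n_bits) - 1          # n_bits=4 gives 15, as in the comment
    low = format(cluster_id & mask, '0{}b'.format(n_bits))
    return low[::-1]                  # undo the reversal


cluster_id = encode_cluster('110100111')
prefix = first_bits(cluster_id)
```

Storing the reversed form means a short cluster prefix can be read with one bitwise AND, whatever the length of the full bit string.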


@ -17,6 +17,7 @@ import spacy.util
from spacy.syntax.util import Config from spacy.syntax.util import Config
from spacy.gold import read_json_file from spacy.gold import read_json_file
from spacy.gold import GoldParse from spacy.gold import GoldParse
from spacy.gold import merge_sents
from spacy.scorer import Scorer from spacy.scorer import Scorer
@ -63,96 +64,24 @@ def score_model(scorer, nlp, raw_text, annot_tuples, verbose=False):
scorer.score(tokens, gold, verbose=verbose) scorer.score(tokens, gold, verbose=verbose)
def _merge_sents(sents): def train(Language, train_data, dev_data, model_dir, tagger_cfg, parser_cfg, entity_cfg,
m_deps = [[], [], [], [], [], []] n_iter=15, seed=0, gold_preproc=False, n_sents=0, corruption_level=0):
m_brackets = []
i = 0
for (ids, words, tags, heads, labels, ner), brackets in sents:
m_deps[0].extend(id_ + i for id_ in ids)
m_deps[1].extend(words)
m_deps[2].extend(tags)
m_deps[3].extend(head + i for head in heads)
m_deps[4].extend(labels)
m_deps[5].extend(ner)
m_brackets.extend((b['first'] + i, b['last'] + i, b['label']) for b in brackets)
i += len(ids)
return [(m_deps, m_brackets)]
def train(Language, gold_tuples, model_dir, n_iter=15, feat_set=u'basic',
seed=0, gold_preproc=False, n_sents=0, corruption_level=0,
beam_width=1, verbose=False,
use_orig_arc_eager=False, pseudoprojective=False):
dep_model_dir = path.join(model_dir, 'deps')
ner_model_dir = path.join(model_dir, 'ner')
pos_model_dir = path.join(model_dir, 'pos')
if path.exists(dep_model_dir):
shutil.rmtree(dep_model_dir)
if path.exists(ner_model_dir):
shutil.rmtree(ner_model_dir)
if path.exists(pos_model_dir):
shutil.rmtree(pos_model_dir)
os.mkdir(dep_model_dir)
os.mkdir(ner_model_dir)
os.mkdir(pos_model_dir)
if pseudoprojective:
# preprocess training data here before ArcEager.get_labels() is called
gold_tuples = PseudoProjectivity.preprocess_training_data(gold_tuples)
Config.write(dep_model_dir, 'config', features=feat_set, seed=seed,
labels=ArcEager.get_labels(gold_tuples),
beam_width=beam_width,projectivize=pseudoprojective)
Config.write(ner_model_dir, 'config', features='ner', seed=seed,
labels=BiluoPushDown.get_labels(gold_tuples),
beam_width=0)
if n_sents > 0:
gold_tuples = gold_tuples[:n_sents]
nlp = Language(data_dir=model_dir, tagger=False, parser=False, entity=False)
nlp.tagger = Tagger.blank(nlp.vocab, Tagger.default_templates())
nlp.parser = Parser.from_dir(dep_model_dir, nlp.vocab.strings, ArcEager)
nlp.entity = Parser.from_dir(ner_model_dir, nlp.vocab.strings, BiluoPushDown)
print("Itn.\tP.Loss\tUAS\tNER F.\tTag %\tToken %") print("Itn.\tP.Loss\tUAS\tNER F.\tTag %\tToken %")
for itn in range(n_iter): format_str = '{:d}\t{:d}\t{uas:.3f}\t{ents_f:.3f}\t{tags_acc:.3f}\t{token_acc:.3f}'
scorer = Scorer() with Language.train(model_dir, train_data,
tagger_cfg, parser_cfg, entity_cfg) as trainer:
loss = 0 loss = 0
for raw_text, sents in gold_tuples: for itn, epoch in enumerate(trainer.epochs(n_iter, gold_preproc=gold_preproc,
if gold_preproc: augment_data=None)):
raw_text = None for doc, gold in epoch:
else: trainer.update(doc, gold)
sents = _merge_sents(sents) dev_scores = trainer.evaluate(dev_data, gold_preproc=gold_preproc)
for annot_tuples, ctnt in sents: print(format_str.format(itn, loss, **dev_scores.scores))
if len(annot_tuples[1]) == 1:
continue
score_model(scorer, nlp, raw_text, annot_tuples,
verbose=verbose if itn >= 2 else False)
if raw_text is None:
words = add_noise(annot_tuples[1], corruption_level)
tokens = nlp.tokenizer.tokens_from_list(words)
else:
raw_text = add_noise(raw_text, corruption_level)
tokens = nlp.tokenizer(raw_text)
nlp.tagger(tokens)
gold = GoldParse(tokens, annot_tuples)
if not gold.is_projective:
raise Exception("Non-projective sentence in training: %s" % annot_tuples[1])
loss += nlp.parser.train(tokens, gold)
nlp.entity.train(tokens, gold)
nlp.tagger.train(tokens, gold.tags)
random.shuffle(gold_tuples)
print('%d:\t%d\t%.3f\t%.3f\t%.3f\t%.3f' % (itn, loss, scorer.uas, scorer.ents_f,
scorer.tags_acc,
scorer.token_acc))
print('end training')
nlp.end_training(model_dir)
print('done')
def evaluate(Language, gold_tuples, model_dir, gold_preproc=False, verbose=False, def evaluate(Language, gold_tuples, model_dir, gold_preproc=False, verbose=False,
beam_width=None, cand_preproc=None): beam_width=None, cand_preproc=None):
nlp = Language(data_dir=model_dir) nlp = Language(path=model_dir)
if nlp.lang == 'de': if nlp.lang == 'de':
nlp.vocab.morphology.lemmatizer = lambda string,pos: set([string]) nlp.vocab.morphology.lemmatizer = lambda string,pos: set([string])
if beam_width is not None: if beam_width is not None:
@ -162,7 +91,7 @@ def evaluate(Language, gold_tuples, model_dir, gold_preproc=False, verbose=False
if gold_preproc: if gold_preproc:
raw_text = None raw_text = None
else: else:
sents = _merge_sents(sents) sents = merge_sents(sents)
for annot_tuples, brackets in sents: for annot_tuples, brackets in sents:
if raw_text is None: if raw_text is None:
tokens = nlp.tokenizer.tokens_from_list(annot_tuples[1]) tokens = nlp.tokenizer.tokens_from_list(annot_tuples[1])
@ -171,7 +100,7 @@ def evaluate(Language, gold_tuples, model_dir, gold_preproc=False, verbose=False
nlp.entity(tokens) nlp.entity(tokens)
else: else:
tokens = nlp(raw_text) tokens = nlp(raw_text)
gold = GoldParse(tokens, annot_tuples) gold = GoldParse.from_annot_tuples(tokens, annot_tuples)
scorer.score(tokens, gold, verbose=verbose) scorer.score(tokens, gold, verbose=verbose)
return scorer return scorer
@ -219,15 +148,21 @@ def write_parses(Language, dev_loc, model_dir, out_loc):
) )
def main(language, train_loc, dev_loc, model_dir, n_sents=0, n_iter=15, out_loc="", verbose=False, def main(language, train_loc, dev_loc, model_dir, n_sents=0, n_iter=15, out_loc="", verbose=False,
debug=False, corruption_level=0.0, gold_preproc=False, eval_only=False, pseudoprojective=False): debug=False, corruption_level=0.0, gold_preproc=False, eval_only=False, pseudoprojective=False):
parser_cfg = dict(locals())
tagger_cfg = dict(locals())
entity_cfg = dict(locals())
lang = spacy.util.get_lang_class(language) lang = spacy.util.get_lang_class(language)
parser_cfg['features'] = lang.Defaults.parser_features
entity_cfg['features'] = lang.Defaults.entity_features
if not eval_only: if not eval_only:
gold_train = list(read_json_file(train_loc)) gold_train = list(read_json_file(train_loc))
train(lang, gold_train, model_dir, gold_dev = list(read_json_file(dev_loc))
feat_set='basic' if not debug else 'debug', train(lang, gold_train, gold_dev, model_dir, tagger_cfg, parser_cfg, entity_cfg,
gold_preproc=gold_preproc, n_sents=n_sents, n_sents=n_sents, gold_preproc=gold_preproc, corruption_level=corruption_level,
corruption_level=corruption_level, n_iter=n_iter, n_iter=n_iter)
verbose=verbose,pseudoprojective=pseudoprojective)
if out_loc: if out_loc:
write_parses(lang, dev_loc, model_dir, out_loc) write_parses(lang, dev_loc, model_dir, out_loc)
scorer = evaluate(lang, list(read_json_file(dev_loc)), scorer = evaluate(lang, list(read_json_file(dev_loc)),
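
The removed _merge_sents helper, now imported as merge_sents from spacy.gold, joins the sentences of a document into one annotation tuple. It shifts token ids, heads and bracket offsets by the number of tokens already merged. Below is a standalone sketch based on the removed code. It assumes sentence-relative indices and brackets given as dicts with 'first', 'last' and 'label' keys. The name merge_sents_sketch is hypothetical, and the version in spacy.gold may differ.

```python
def merge_sents_sketch(sents):
    """Concatenate per-sentence annotations into one document-level tuple."""
    m_deps = [[], [], [], [], [], []]   # ids, words, tags, heads, labels, ner
    m_brackets = []
    offset = 0
    for (ids, words, tags, heads, labels, ner), brackets in sents:
        m_deps[0].extend(id_ + offset for id_ in ids)
        m_deps[1].extend(words)
        m_deps[2].extend(tags)
        m_deps[3].extend(head + offset for head in heads)
        m_deps[4].extend(labels)
        m_deps[5].extend(ner)
        m_brackets.extend((b['first'] + offset, b['last'] + offset, b['label'])
                          for b in brackets)
        offset += len(ids)
    return [(m_deps, m_brackets)]


# Made-up two-sentence document, for illustration only.
example = [
    (([0, 1], ['Hello', 'world'], ['UH', 'NN'], [1, 1],
      ['intj', 'ROOT'], ['O', 'O']),
     [{'first': 0, 'last': 1, 'label': 'NP'}]),
    (([0, 1, 2], ['It', 'is', 'late'], ['PRP', 'VBZ', 'JJ'], [1, 1, 1],
      ['nsubj', 'ROOT', 'acomp'], ['O', 'O', 'O']),
     [{'first': 0, 'last': 2, 'label': 'S'}]),
]
merged = merge_sents_sketch(example)
```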


@ -1,3 +1,4 @@
from __future__ import unicode_literals
import plac import plac
import json import json
from os import path from os import path
@ -5,106 +6,25 @@ import shutil
import os import os
import random import random
import io import io
import pathlib
from spacy.syntax.util import Config from spacy.tokens import Doc
from spacy.syntax.nonproj import PseudoProjectivity
from spacy.language import Language
from spacy.gold import GoldParse from spacy.gold import GoldParse
from spacy.tokenizer import Tokenizer
from spacy.vocab import Vocab from spacy.vocab import Vocab
from spacy.tagger import Tagger from spacy.tagger import Tagger
from spacy.syntax.parser import Parser from spacy.pipeline import DependencyParser
from spacy.syntax.arc_eager import ArcEager
from spacy.syntax.parser import get_templates from spacy.syntax.parser import get_templates
from spacy.syntax.arc_eager import ArcEager
from spacy.scorer import Scorer from spacy.scorer import Scorer
import spacy.attrs import spacy.attrs
import io
from spacy.language import Language
from spacy.tagger import W_orth
TAGGER_TEMPLATES = (
(W_orth,),
)
try:
from codecs import open
except ImportError:
pass
class TreebankParser(object):
@staticmethod
def setup_model_dir(model_dir, labels, templates, feat_set='basic', seed=0):
dep_model_dir = path.join(model_dir, 'deps')
pos_model_dir = path.join(model_dir, 'pos')
if path.exists(dep_model_dir):
shutil.rmtree(dep_model_dir)
if path.exists(pos_model_dir):
shutil.rmtree(pos_model_dir)
os.mkdir(dep_model_dir)
os.mkdir(pos_model_dir)
Config.write(dep_model_dir, 'config', features=feat_set, seed=seed,
labels=labels)
@classmethod
def from_dir(cls, tag_map, model_dir):
vocab = Vocab(tag_map=tag_map, get_lex_attr=Language.default_lex_attrs())
vocab.get_lex_attr[spacy.attrs.LANG] = lambda _: 0
tokenizer = Tokenizer(vocab, {}, None, None, None)
tagger = Tagger.blank(vocab, TAGGER_TEMPLATES)
cfg = Config.read(path.join(model_dir, 'deps'), 'config')
parser = Parser.from_dir(path.join(model_dir, 'deps'), vocab.strings, ArcEager)
return cls(vocab, tokenizer, tagger, parser)
def __init__(self, vocab, tokenizer, tagger, parser):
self.vocab = vocab
self.tokenizer = tokenizer
self.tagger = tagger
self.parser = parser
def train(self, words, tags, heads, deps):
tokens = self.tokenizer.tokens_from_list(list(words))
self.tagger.train(tokens, tags)
tokens = self.tokenizer.tokens_from_list(list(words))
ids = range(len(words))
ner = ['O'] * len(words)
gold = GoldParse(tokens, ((ids, words, tags, heads, deps, ner)),
make_projective=False)
self.tagger(tokens)
if gold.is_projective:
try:
self.parser.train(tokens, gold)
except:
for id_, word, head, dep in zip(ids, words, heads, deps):
print(id_, word, head, dep)
raise
def __call__(self, words, tags=None):
tokens = self.tokenizer.tokens_from_list(list(words))
if tags is None:
self.tagger(tokens)
else:
self.tagger.tag_from_strings(tokens, tags)
self.parser(tokens)
return tokens
def end_training(self, data_dir):
self.parser.model.end_training()
self.parser.model.dump(path.join(data_dir, 'deps', 'model'))
self.tagger.model.end_training()
self.tagger.model.dump(path.join(data_dir, 'pos', 'model'))
strings_loc = path.join(data_dir, 'vocab', 'strings.json')
with io.open(strings_loc, 'w', encoding='utf8') as file_:
self.vocab.strings.dump(file_)
self.vocab.dump(path.join(data_dir, 'vocab', 'lexemes.bin'))
def read_conllx(loc): def read_conllx(loc):
with open(loc, 'r', 'utf8') as file_: with io.open(loc, 'r', encoding='utf8') as file_:
text = file_.read() text = file_.read()
for sent in text.strip().split('\n\n'): for sent in text.strip().split('\n\n'):
lines = sent.strip().split('\n') lines = sent.strip().split('\n')
@ -113,24 +33,31 @@ def read_conllx(loc):
lines.pop(0) lines.pop(0)
tokens = [] tokens = []
for line in lines: for line in lines:
id_, word, lemma, pos, tag, morph, head, dep, _1, _2 = line.split() id_, word, lemma, tag, pos, morph, head, dep, _1, _2 = line.split()
if '-' in id_: if '-' in id_:
continue continue
try:
id_ = int(id_) - 1 id_ = int(id_) - 1
head = (int(head) - 1) if head != '0' else id_ head = (int(head) - 1) if head != '0' else id_
dep = 'ROOT' if dep == 'root' else dep dep = 'ROOT' if dep == 'root' else dep
tokens.append((id_, word, tag, head, dep, 'O')) tokens.append((id_, word, tag, head, dep, 'O'))
tuples = zip(*tokens) except:
yield (None, [(tuples, [])]) print(line)
raise
tuples = [list(t) for t in zip(*tokens)]
yield (None, [[tuples, []]])
def score_model(nlp, gold_docs, verbose=False): def score_model(vocab, tagger, parser, gold_docs, verbose=False):
scorer = Scorer() scorer = Scorer()
for _, gold_doc in gold_docs: for _, gold_doc in gold_docs:
for annot_tuples, _ in gold_doc: for (ids, words, tags, heads, deps, entities), _ in gold_doc:
tokens = nlp(list(annot_tuples[1]), tags=list(annot_tuples[2])) doc = Doc(vocab, words=words)
gold = GoldParse(tokens, annot_tuples) tagger(doc)
scorer.score(tokens, gold, verbose=verbose) parser(doc)
PseudoProjectivity.deprojectivize(doc)
gold = GoldParse(doc, tags=tags, heads=heads, deps=deps)
scorer.score(doc, gold, verbose=verbose)
return scorer return scorer
@ -138,22 +65,45 @@ def main(train_loc, dev_loc, model_dir, tag_map_loc):
with open(tag_map_loc) as file_: with open(tag_map_loc) as file_:
tag_map = json.loads(file_.read()) tag_map = json.loads(file_.read())
train_sents = list(read_conllx(train_loc)) train_sents = list(read_conllx(train_loc))
labels = ArcEager.get_labels(train_sents) train_sents = PseudoProjectivity.preprocess_training_data(train_sents)
templates = get_templates('basic')
TreebankParser.setup_model_dir(model_dir, labels, templates) actions = ArcEager.get_actions(gold_parses=train_sents)
features = get_templates('basic')
nlp = TreebankParser.from_dir(tag_map, model_dir) model_dir = pathlib.Path(model_dir)
with (model_dir / 'deps' / 'config.json').open('w') as file_:
json.dump({'pseudoprojective': True, 'labels': actions, 'features': features}, file_)
vocab = Vocab(lex_attr_getters=Language.Defaults.lex_attr_getters, tag_map=tag_map)
# Populate vocab
for _, doc_sents in train_sents:
for (ids, words, tags, heads, deps, ner), _ in doc_sents:
for word in words:
_ = vocab[word]
for dep in deps:
_ = vocab[dep]
for tag in tags:
_ = vocab[tag]
for tag in tags:
assert tag in tag_map, repr(tag)
tagger = Tagger(vocab, tag_map=tag_map)
parser = DependencyParser(vocab, actions=actions, features=features)
for itn in range(15): for itn in range(15):
for _, doc_sents in train_sents: for _, doc_sents in train_sents:
for (ids, words, tags, heads, deps, ner), _ in doc_sents: for (ids, words, tags, heads, deps, ner), _ in doc_sents:
nlp.train(words, tags, heads, deps) doc = Doc(vocab, words=words)
gold = GoldParse(doc, tags=tags, heads=heads, deps=deps)
tagger(doc)
parser.update(doc, gold)
doc = Doc(vocab, words=words)
tagger.update(doc, gold)
random.shuffle(train_sents) random.shuffle(train_sents)
scorer = score_model(nlp, read_conllx(dev_loc)) scorer = score_model(vocab, tagger, parser, read_conllx(dev_loc))
print('%d:\t%.3f\t%.3f' % (itn, scorer.uas, scorer.tags_acc)) print('%d:\t%.3f\t%.3f' % (itn, scorer.uas, scorer.tags_acc))
nlp = Language(vocab=vocab, tagger=tagger, parser=parser)
nlp.end_training(model_dir) nlp.end_training(model_dir)
scorer = score_model(nlp, read_conllx(dev_loc)) scorer = score_model(vocab, tagger, parser, read_conllx(dev_loc))
print('%d:\t%.3f\t%.3f\t%.3f' % (itn, scorer.uas, scorer.las, scorer.tags_acc)) print('%d:\t%.3f\t%.3f\t%.3f' % (itn, scorer.uas, scorer.las, scorer.tags_acc))
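
The token conversion in read_conllx can be tried without any treebank file. Below is a minimal sketch that follows the new side of the diff: ten whitespace-separated columns in the order id, word, lemma, tag, pos, morph, head, dep and two unused fields, with 0-based ids and the root token attached to itself. The function name parse_conllx_sentence and the sample sentence are made up for illustration.

```python
def parse_conllx_sentence(sent):
    """Convert one CoNLL-X sentence block into column-wise annotation lists."""
    tokens = []
    for line in sent.strip().split('\n'):
        id_, word, lemma, tag, pos, morph, head, dep, _1, _2 = line.split()
        if '-' in id_:
            # Multi-word token ranges such as "3-4" carry no annotation.
            continue
        id_ = int(id_) - 1                                  # 0-based ids
        head = (int(head) - 1) if head != '0' else id_      # root heads itself
        dep = 'ROOT' if dep == 'root' else dep
        tokens.append((id_, word, tag, head, dep, 'O'))     # 'O': no entity
    # Transpose the rows into [ids, words, tags, heads, deps, ner].
    return [list(column) for column in zip(*tokens)]


sample = (
    "1\tDogs\tdog\tNOUN\tNNS\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_"
)
ids, words, tags, heads, deps, ner = parse_conllx_sentence(sample)
```

Returning lists instead of a bare zip object keeps the result indexable and reusable across training iterations, which is what the changed `tuples = [list(t) for t in zip(*tokens)]` line does.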


@ -1,95 +0,0 @@
Syllogism Contributor Agreement
===============================
This Syllogism Contributor Agreement (“SCA”) is based on the Oracle Contributor
Agreement. The SCA applies to any contribution that you make to any product or
project managed by us (the “project”), and sets out the intellectual property
rights you grant to us in the contributed materials. The term “us” shall mean
Syllogism Co. The term "you" shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested below
and include the filled-in version with your first pull request, under the folder
contributors/. The name of the file should be your GitHub username, with the
extension .md. For example, the user example_user would create the file
spaCy/contributors/example_user.md.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
1. The term 'contribution' or 'contributed materials' means any source code,
object code, patch, tool, sample, graphic, specification, manual, documentation,
or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and registrations,
in your contribution:
* you hereby assign to us joint ownership, and to the extent that such assignment
is or becomes invalid, ineffective or unenforceable, you hereby grant to us a perpetual,
irrevocable, non-exclusive, worldwide, no-charge, royalty-free, unrestricted license
to exercise all rights under those copyrights. This includes, at our option, the
right to sublicense these same rights to third parties through multiple levels of
sublicensees or other licensing arrangements;
* you agree that each of us can do all things in relation to your contribution
as if each of us were the sole owners, and if one of us makes a derivative work
of your contribution, the one who makes the derivative work (or has it made) will
be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution against
us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and exercise
all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the consent
of, pay or render an accounting to the other for any use or distribution of your
contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable, non-exclusive,
worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer your
contribution in whole or in part, alone or in combination with
or included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through multiple
levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective on
the date you first submitted a contribution to us, even if your submission took
place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of authorship
and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any third
party's copyrights, trademarks, patents, or other intellectual property rights; and
* each contribution shall be in compliance with U.S. export control laws and other
applicable export and import laws. You agree to notify us if you become aware of
any circumstance which would make any of the foregoing representations inaccurate
in any respect. Syllogism Co. may publicly disclose your participation in the project,
including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable U.S.
Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
_x__ I am signing on behalf of myself as an individual and no other person or entity, including my employer, has or will have rights with respect to my contributions.
____ I am signing on behalf of my employer or a legal entity and I have the actual authority to contractually bind that entity.
| Field | Entry |
|------------------------------- | -------------------- |
| Name | J Nicolas Schrading |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-08-24 |
| GitHub username | NSchrading |
| Website (optional) | nicschrading.com |


@ -1,95 +0,0 @@
Syllogism Contributor Agreement
===============================
This Syllogism Contributor Agreement (“SCA”) is based on the Oracle Contributor
Agreement. The SCA applies to any contribution that you make to any product or
project managed by us (the “project”), and sets out the intellectual property
rights you grant to us in the contributed materials. The term “us” shall mean
Syllogism Co. The term "you" shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested below
and include the filled-in version with your first pull request, under the folder
contributors/. The name of the file should be your GitHub username, with the
extension .md. For example, the user example_user would create the file
spaCy/contributors/example_user.md.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
1. The term 'contribution' or 'contributed materials' means any source code,
object code, patch, tool, sample, graphic, specification, manual, documentation,
or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and registrations,
in your contribution:
* you hereby assign to us joint ownership, and to the extent that such assignment
is or becomes invalid, ineffective or unenforceable, you hereby grant to us a perpetual,
irrevocable, non-exclusive, worldwide, no-charge, royalty-free, unrestricted license
to exercise all rights under those copyrights. This includes, at our option, the
right to sublicense these same rights to third parties through multiple levels of
sublicensees or other licensing arrangements;
* you agree that each of us can do all things in relation to your contribution
as if each of us were the sole owners, and if one of us makes a derivative work
of your contribution, the one who makes the derivative work (or has it made) will
be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution against
us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and exercise
all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the consent
of, pay or render an accounting to the other for any use or distribution of your
contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable, non-exclusive,
worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer your
contribution in whole or in part, alone or in combination with
or included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through multiple
levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective on
the date you first submitted a contribution to us, even if your submission took
place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of authorship
and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any third
party's copyrights, trademarks, patents, or other intellectual property rights; and
* each contribution shall be in compliance with U.S. export control laws and other
applicable export and import laws. You agree to notify us if you become aware of
any circumstance which would make any of the foregoing representations inaccurate
in any respect. Syllogism Co. may publicly disclose your participation in the project,
including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable U.S.
Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
x I am signing on behalf of myself as an individual and no other person or entity, including my employer, has or will have rights with respect to my contributions.
____ I am signing on behalf of my employer or a legal entity and I have the actual authority to contractually bind that entity.
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Chris DuBois |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015.10.07 |
| GitHub username | chrisdubois |
| Website (optional) | |


@ -1,13 +0,0 @@
Signing the Contributors License Agreement
==========================================
SpaCy is a commercial open-source project, owned by Syllogism Co. We require that contributors to SpaCy sign our Contributors License Agreement, which is based on the Oracle Contributor Agreement.
The CLA must be signed on your first pull request. To do this, simply fill in the file cla_template.md, and include the filled-in form in your first pull request.
$ git clone https://github.com/honnibal/spaCy
$ cp spaCy/contributors/cla_template.md spaCy/contributors/<your GitHub username>.md
<Now fill in the file spaCy/contributors/<your GitHub username>.md>
$ git add -A spaCy/contributors/<your GitHub username>.md
Now finish your pull request, and you're done.


@ -1,95 +0,0 @@
Syllogism Contributor Agreement
===============================
This Syllogism Contributor Agreement (“SCA”) is based on the Oracle Contributor
Agreement. The SCA applies to any contribution that you make to any product or
project managed by us (the “project”), and sets out the intellectual property
rights you grant to us in the contributed materials. The term “us” shall mean
Syllogism Co. The term "you" shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested below
and include the filled-in version with your first pull request, under the folder
contributors/. The name of the file should be your GitHub username, with the
extension .md. For example, the user example_user would create the file
spaCy/contributors/example_user.md.
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
1. The term 'contribution' or 'contributed materials' means any source code,
object code, patch, tool, sample, graphic, specification, manual, documentation,
or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and registrations,
in your contribution:
* you hereby assign to us joint ownership, and to the extent that such assignment
is or becomes invalid, ineffective or unenforceable, you hereby grant to us a perpetual,
irrevocable, non-exclusive, worldwide, no-charge, royalty-free, unrestricted license
to exercise all rights under those copyrights. This includes, at our option, the
right to sublicense these same rights to third parties through multiple levels of
sublicensees or other licensing arrangements;
* you agree that each of us can do all things in relation to your contribution
as if each of us were the sole owners, and if one of us makes a derivative work
of your contribution, the one who makes the derivative work (or has it made) will
be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution against
us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and exercise
all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the consent
of, pay or render an accounting to the other for any use or distribution of your
contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable, non-exclusive,
worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer your
contribution in whole or in part, alone or in combination with
or included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through multiple
levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective on
the date you first submitted a contribution to us, even if your submission took
place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of authorship
and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any third
party's copyrights, trademarks, patents, or other intellectual property rights; and
* each contribution shall be in compliance with U.S. export control laws and other
applicable export and import laws. You agree to notify us if you become aware of
any circumstance which would make any of the foregoing representations inaccurate
in any respect. Syllogism Co. may publicly disclose your participation in the project,
including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable U.S.
Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
____ I am signing on behalf of myself as an individual and no other person or entity, including my employer, has or will have rights with respect to my contributions.
____ I am signing on behalf of my employer or a legal entity and I have the actual authority to contractually bind that entity.
| Field | Entry |
|------------------------------- | -------------------- |
| Name | |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | |
| GitHub username | |
| Website (optional) | |

@@ -1,95 +0,0 @@
Syllogism Contributor Agreement
===============================
This Syllogism Contributor Agreement (“SCA”) is based on the Oracle Contributor
Agreement. The SCA applies to any contribution that you make to any product or
project managed by us (the “project”), and sets out the intellectual property
rights you grant to us in the contributed materials. The term “us” shall mean
Syllogism Co. The term "you" shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested below
and include the filled-in version with your first pull-request, under the folder
contributors/. The name of the file should be your GitHub username, with the
extension .md. For example, the user example_user would create the file
spaCy/contributors/example_user.md .
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
1. The term 'contribution' or contributed materials means any source code,
object code, patch, tool, sample, graphic, specification, manual, documentation,
or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and registrations,
in your contribution:
* you hereby assign to us joint ownership, and to the extent that such assignment
is or becomes invalid, ineffective or unenforceable, you hereby grant to us a perpetual,
irrevocable, non-exclusive, worldwide, no-charge, royalty-free, unrestricted license
to exercise all rights under those copyrights. This includes, at our option, the
right to sublicense these same rights to third parties through multiple levels of
sublicensees or other licensing arrangements;
* you agree that each of us can do all things in relation to your contribution
as if each of us were the sole owners, and if one of us makes a derivative work
of your contribution, the one who makes the derivative work (or has it made) will
be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution against
us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and exercise
all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the consent
of, pay or render an accounting to the other for any use or distribution of your
contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable, non-exclusive,
worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer your
contribution in whole or in part, alone or in combination with
or included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through multiple
levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective on
the date you first submitted a contribution to us, even if your submission took
place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of authorship
and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any third
party's copyrights, trademarks, patents, or other intellectual property rights; and
* each contribution shall be in compliance with U.S. export control laws and other
applicable export and import laws. You agree to notify us if you become aware of
any circumstance which would make any of the foregoing representations inaccurate
in any respect. Syllogism Co. may publicly disclose your participation in the project,
including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable U.S.
Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
x___ I am signing on behalf of myself as an individual and no other person or entity, including my employer, has or will have rights with respect to my contributions.
____ I am signing on behalf of my employer or a legal entity and I have the actual authority to contractually bind that entity.
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Jordan Suchow |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-04-19 |
| GitHub username | suchow |
| Website (optional) | http://suchow.io |

@@ -1,95 +0,0 @@
Syllogism Contributor Agreement
===============================
This Syllogism Contributor Agreement (“SCA”) is based on the Oracle Contributor
Agreement. The SCA applies to any contribution that you make to any product or
project managed by us (the “project”), and sets out the intellectual property
rights you grant to us in the contributed materials. The term “us” shall mean
Syllogism Co. The term "you" shall mean the person or entity identified below.
If you agree to be bound by these terms, fill in the information requested below
and include the filled-in version with your first pull-request, under the folder
contributors/. The name of the file should be your GitHub username, with the
extension .md. For example, the user example_user would create the file
spaCy/contributors/example_user.md .
Read this agreement carefully before signing. These terms and conditions
constitute a binding legal agreement.
1. The term 'contribution' or contributed materials means any source code,
object code, patch, tool, sample, graphic, specification, manual, documentation,
or any other material posted or submitted by you to the project.
2. With respect to any worldwide copyrights, or copyright applications and registrations,
in your contribution:
* you hereby assign to us joint ownership, and to the extent that such assignment
is or becomes invalid, ineffective or unenforceable, you hereby grant to us a perpetual,
irrevocable, non-exclusive, worldwide, no-charge, royalty-free, unrestricted license
to exercise all rights under those copyrights. This includes, at our option, the
right to sublicense these same rights to third parties through multiple levels of
sublicensees or other licensing arrangements;
* you agree that each of us can do all things in relation to your contribution
as if each of us were the sole owners, and if one of us makes a derivative work
of your contribution, the one who makes the derivative work (or has it made) will
be the sole owner of that derivative work;
* you agree that you will not assert any moral rights in your contribution against
us, our licensees or transferees;
* you agree that we may register a copyright in your contribution and exercise
all ownership rights associated with it; and
* you agree that neither of us has any duty to consult with, obtain the consent
of, pay or render an accounting to the other for any use or distribution of your
contribution.
3. With respect to any patents you own, or that you can license without payment
to any third party, you hereby grant to us a perpetual, irrevocable, non-exclusive,
worldwide, no-charge, royalty-free license to:
* make, have made, use, sell, offer to sell, import, and otherwise transfer your
contribution in whole or in part, alone or in combination with
or included in any product, work or materials arising out of the project to
which your contribution was submitted, and
* at our option, to sublicense these same rights to third parties through multiple
levels of sublicensees or other licensing arrangements.
4. Except as set out above, you keep all right, title, and interest in your
contribution. The rights that you grant to us under these terms are effective on
the date you first submitted a contribution to us, even if your submission took
place before the date you sign these terms.
5. You covenant, represent, warrant and agree that:
* Each contribution that you submit is and shall be an original work of authorship
and you can legally grant the rights set out in this SCA;
* to the best of your knowledge, each contribution will not violate any third
party's copyrights, trademarks, patents, or other intellectual property rights; and
* each contribution shall be in compliance with U.S. export control laws and other
applicable export and import laws. You agree to notify us if you become aware of
any circumstance which would make any of the foregoing representations inaccurate
in any respect. Syllogism Co. may publicly disclose your participation in the project,
including the fact that you have signed the SCA.
6. This SCA is governed by the laws of the State of California and applicable U.S.
Federal law. Any choice of law rules will not apply.
7. Please place an “x” on one of the applicable statements below. Please do NOT
mark both statements:
_x__ I am signing on behalf of myself as an individual and no other person or entity, including my employer, has or will have rights with respect to my contributions.
____ I am signing on behalf of my employer or a legal entity and I have the actual authority to contractually bind that entity.
| Field | Entry |
|------------------------------- | -------------------- |
| Name | Vsevolod Solovyov |
| Company's name (if applicable) | |
| Title or Role (if applicable) | |
| Date | 2015-08-24 |
| GitHub username | vsolovyov |
| Website (optional) | |

@@ -1,426 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# doc/Makefile. Generated from Makefile.in by configure.
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
srcdir = .
top_srcdir = ..
pkgdatadir = $(datadir)/WordNet
pkglibdir = $(libdir)/WordNet
pkgincludedir = $(includedir)/WordNet
top_builddir = ..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = /usr/csl/bin/install -c
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
html-recursive info-recursive install-data-recursive \
install-exec-recursive install-info-recursive \
install-recursive installcheck-recursive installdirs-recursive \
pdf-recursive ps-recursive uninstall-info-recursive \
uninstall-recursive
ETAGS = etags
CTAGS = ctags
DIST_SUBDIRS = $(SUBDIRS)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run aclocal-1.9
AMDEP_FALSE = #
AMDEP_TRUE =
AMTAR = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run tar
AUTOCONF = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoconf
AUTOHEADER = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoheader
AUTOMAKE = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run automake-1.9
AWK = nawk
CC = gcc
CCDEPMODE = depmode=gcc3
CFLAGS = -g -O2
CPP = gcc -E
CPPFLAGS =
CYGPATH_W = echo
DEFS = -DHAVE_CONFIG_H
DEPDIR = .deps
ECHO_C =
ECHO_N = -n
ECHO_T =
EGREP = egrep
EXEEXT =
INSTALL_DATA = ${INSTALL} -m 644
INSTALL_PROGRAM = ${INSTALL}
INSTALL_SCRIPT = ${INSTALL}
INSTALL_STRIP_PROGRAM = ${SHELL} $(install_sh) -c -s
LDFLAGS =
LIBOBJS =
LIBS =
LTLIBOBJS =
MAKEINFO = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run makeinfo
OBJEXT = o
PACKAGE = WordNet
PACKAGE_BUGREPORT = wordnet@princeton.edu
PACKAGE_NAME = WordNet
PACKAGE_STRING = WordNet 3.0
PACKAGE_TARNAME = wordnet
PACKAGE_VERSION = 3.0
PATH_SEPARATOR = :
RANLIB = ranlib
SET_MAKE =
SHELL = /bin/bash
STRIP =
TCL_INCLUDE_SPEC = -I/usr/csl/include
TCL_LIB_SPEC = -L/usr/csl/lib -ltcl8.4
TK_LIBS = -L/usr/openwin/lib -lX11 -ldl -lpthread -lsocket -lnsl -lm
TK_LIB_SPEC = -L/usr/csl/lib -ltk8.4
TK_PREFIX = /usr/csl
TK_XINCLUDES = -I/usr/openwin/include
VERSION = 3.0
ac_ct_CC = gcc
ac_ct_RANLIB = ranlib
ac_ct_STRIP =
ac_prefix = /usr/local/WordNet-3.0
am__fastdepCC_FALSE = #
am__fastdepCC_TRUE =
am__include = include
am__leading_dot = .
am__quote =
am__tar = ${AMTAR} chof - "$$tardir"
am__untar = ${AMTAR} xf -
bindir = ${exec_prefix}/bin
build_alias =
datadir = ${prefix}/share
exec_prefix = ${prefix}
host_alias =
includedir = ${prefix}/include
infodir = ${prefix}/info
install_sh = /people/wn/src/Release/3.0/Unix/install-sh
libdir = ${exec_prefix}/lib
libexecdir = ${exec_prefix}/libexec
localstatedir = ${prefix}/var
mandir = ${prefix}/man
mkdir_p = $(install_sh) -d
oldincludedir = /usr/include
prefix = /usr/local/WordNet-3.0
program_transform_name = s,x,x,
sbindir = ${exec_prefix}/sbin
sharedstatedir = ${prefix}/com
sysconfdir = ${prefix}/etc
target_alias =
SUBDIRS = html man pdf ps
all: all-recursive
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
# This directory's subdirectories are mostly independent; you can cd
# into them and run `make' without going through this Makefile.
# To change the values of `make' variables: instead of editing Makefiles,
# (1) if the variable is set in `config.status', edit `config.status'
# (which will cause the Makefiles to be regenerated when you run `make');
# (2) otherwise, pass the desired values on the `make' command line.
$(RECURSIVE_TARGETS):
@set fnord $$MAKEFLAGS; amf=$$2; \
dot_seen=no; \
target=`echo $@ | sed s/-recursive//`; \
list='$(SUBDIRS)'; for subdir in $$list; do \
echo "Making $$target in $$subdir"; \
if test "$$subdir" = "."; then \
dot_seen=yes; \
local_target="$$target-am"; \
else \
local_target="$$target"; \
fi; \
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|| case "$$amf" in *=*) exit 1;; *k*) fail=yes;; *) exit 1;; esac; \
done; \
if test "$$dot_seen" = "no"; then \
$(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
fi; test -z "$$fail"
mostlyclean-recursive clean-recursive distclean-recursive \
maintainer-clean-recursive:
@set fnord $$MAKEFLAGS; amf=$$2; \
dot_seen=no; \
case "$@" in \
distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
*) list='$(SUBDIRS)' ;; \
esac; \
rev=''; for subdir in $$list; do \
if test "$$subdir" = "."; then :; else \
rev="$$subdir $$rev"; \
fi; \
done; \
rev="$$rev ."; \
target=`echo $@ | sed s/-recursive//`; \
for subdir in $$rev; do \
echo "Making $$target in $$subdir"; \
if test "$$subdir" = "."; then \
local_target="$$target-am"; \
else \
local_target="$$target"; \
fi; \
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|| case "$$amf" in *=*) exit 1;; *k*) fail=yes;; *) exit 1;; esac; \
done && test -z "$$fail"
tags-recursive:
list='$(SUBDIRS)'; for subdir in $$list; do \
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
done
ctags-recursive:
list='$(SUBDIRS)'; for subdir in $$list; do \
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
done
ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
mkid -fID $$unique
tags: TAGS
TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
$(TAGS_FILES) $(LISP)
tags=; \
here=`pwd`; \
if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
include_option=--etags-include; \
empty_fix=.; \
else \
include_option=--include; \
empty_fix=; \
fi; \
list='$(SUBDIRS)'; for subdir in $$list; do \
if test "$$subdir" = .; then :; else \
test ! -f $$subdir/TAGS || \
tags="$$tags $$include_option=$$here/$$subdir/TAGS"; \
fi; \
done; \
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
if test -z "$(ETAGS_ARGS)$$tags$$unique"; then :; else \
test -n "$$unique" || unique=$$empty_fix; \
$(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
$$tags $$unique; \
fi
ctags: CTAGS
CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
$(TAGS_FILES) $(LISP)
tags=; \
here=`pwd`; \
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
test -z "$(CTAGS_ARGS)$$tags$$unique" \
|| $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
$$tags $$unique
GTAGS:
here=`$(am__cd) $(top_builddir) && pwd` \
&& cd $(top_srcdir) \
&& gtags -i $(GTAGS_ARGS) $$here
distclean-tags:
-rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
if test "$$subdir" = .; then :; else \
test -d "$(distdir)/$$subdir" \
|| $(mkdir_p) "$(distdir)/$$subdir" \
|| exit 1; \
distdir=`$(am__cd) $(distdir) && pwd`; \
top_distdir=`$(am__cd) $(top_distdir) && pwd`; \
(cd $$subdir && \
$(MAKE) $(AM_MAKEFLAGS) \
top_distdir="$$top_distdir" \
distdir="$$distdir/$$subdir" \
distdir) \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-recursive
all-am: Makefile
installdirs: installdirs-recursive
installdirs-am:
install: install-recursive
install-exec: install-exec-recursive
install-data: install-data-recursive
uninstall: uninstall-recursive
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-recursive
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-recursive
clean-am: clean-generic mostlyclean-am
distclean: distclean-recursive
-rm -f Makefile
distclean-am: clean-am distclean-generic distclean-tags
dvi: dvi-recursive
dvi-am:
html: html-recursive
info: info-recursive
info-am:
install-data-am:
install-exec-am:
install-info: install-info-recursive
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-recursive
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-recursive
mostlyclean-am: mostlyclean-generic
pdf: pdf-recursive
pdf-am:
ps: ps-recursive
ps-am:
uninstall-am: uninstall-info-am
uninstall-info: uninstall-info-recursive
.PHONY: $(RECURSIVE_TARGETS) CTAGS GTAGS all all-am check check-am \
clean clean-generic clean-recursive ctags ctags-recursive \
distclean distclean-generic distclean-recursive distclean-tags \
distdir dvi dvi-am html html-am info info-am install \
install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-strip installcheck installcheck-am installdirs \
installdirs-am maintainer-clean maintainer-clean-generic \
maintainer-clean-recursive mostlyclean mostlyclean-generic \
mostlyclean-recursive pdf pdf-am ps ps-am tags tags-recursive \
uninstall uninstall-am uninstall-info-am
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@@ -1 +0,0 @@
SUBDIRS = html man pdf ps

@@ -1,426 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# @configure_input@
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
@SET_MAKE@
srcdir = @srcdir@
top_srcdir = @top_srcdir@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
top_builddir = ..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = @INSTALL@
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
RECURSIVE_TARGETS = all-recursive check-recursive dvi-recursive \
html-recursive info-recursive install-data-recursive \
install-exec-recursive install-info-recursive \
install-recursive installcheck-recursive installdirs-recursive \
pdf-recursive ps-recursive uninstall-info-recursive \
uninstall-recursive
ETAGS = etags
CTAGS = ctags
DIST_SUBDIRS = $(SUBDIRS)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
AMDEP_FALSE = @AMDEP_FALSE@
AMDEP_TRUE = @AMDEP_TRUE@
AMTAR = @AMTAR@
AUTOCONF = @AUTOCONF@
AUTOHEADER = @AUTOHEADER@
AUTOMAKE = @AUTOMAKE@
AWK = @AWK@
CC = @CC@
CCDEPMODE = @CCDEPMODE@
CFLAGS = @CFLAGS@
CPP = @CPP@
CPPFLAGS = @CPPFLAGS@
CYGPATH_W = @CYGPATH_W@
DEFS = @DEFS@
DEPDIR = @DEPDIR@
ECHO_C = @ECHO_C@
ECHO_N = @ECHO_N@
ECHO_T = @ECHO_T@
EGREP = @EGREP@
EXEEXT = @EXEEXT@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
LDFLAGS = @LDFLAGS@
LIBOBJS = @LIBOBJS@
LIBS = @LIBS@
LTLIBOBJS = @LTLIBOBJS@
MAKEINFO = @MAKEINFO@
OBJEXT = @OBJEXT@
PACKAGE = @PACKAGE@
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
PACKAGE_NAME = @PACKAGE_NAME@
PACKAGE_STRING = @PACKAGE_STRING@
PACKAGE_TARNAME = @PACKAGE_TARNAME@
PACKAGE_VERSION = @PACKAGE_VERSION@
PATH_SEPARATOR = @PATH_SEPARATOR@
RANLIB = @RANLIB@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
STRIP = @STRIP@
TCL_INCLUDE_SPEC = @TCL_INCLUDE_SPEC@
TCL_LIB_SPEC = @TCL_LIB_SPEC@
TK_LIBS = @TK_LIBS@
TK_LIB_SPEC = @TK_LIB_SPEC@
TK_PREFIX = @TK_PREFIX@
TK_XINCLUDES = @TK_XINCLUDES@
VERSION = @VERSION@
ac_ct_CC = @ac_ct_CC@
ac_ct_RANLIB = @ac_ct_RANLIB@
ac_ct_STRIP = @ac_ct_STRIP@
ac_prefix = @ac_prefix@
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
am__include = @am__include@
am__leading_dot = @am__leading_dot@
am__quote = @am__quote@
am__tar = @am__tar@
am__untar = @am__untar@
bindir = @bindir@
build_alias = @build_alias@
datadir = @datadir@
exec_prefix = @exec_prefix@
host_alias = @host_alias@
includedir = @includedir@
infodir = @infodir@
install_sh = @install_sh@
libdir = @libdir@
libexecdir = @libexecdir@
localstatedir = @localstatedir@
mandir = @mandir@
mkdir_p = @mkdir_p@
oldincludedir = @oldincludedir@
prefix = @prefix@
program_transform_name = @program_transform_name@
sbindir = @sbindir@
sharedstatedir = @sharedstatedir@
sysconfdir = @sysconfdir@
target_alias = @target_alias@
SUBDIRS = html man pdf ps
all: all-recursive
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
# This directory's subdirectories are mostly independent; you can cd
# into them and run `make' without going through this Makefile.
# To change the values of `make' variables: instead of editing Makefiles,
# (1) if the variable is set in `config.status', edit `config.status'
# (which will cause the Makefiles to be regenerated when you run `make');
# (2) otherwise, pass the desired values on the `make' command line.
$(RECURSIVE_TARGETS):
@set fnord $$MAKEFLAGS; amf=$$2; \
dot_seen=no; \
target=`echo $@ | sed s/-recursive//`; \
list='$(SUBDIRS)'; for subdir in $$list; do \
echo "Making $$target in $$subdir"; \
if test "$$subdir" = "."; then \
dot_seen=yes; \
local_target="$$target-am"; \
else \
local_target="$$target"; \
fi; \
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|| case "$$amf" in *=*) exit 1;; *k*) fail=yes;; *) exit 1;; esac; \
done; \
if test "$$dot_seen" = "no"; then \
$(MAKE) $(AM_MAKEFLAGS) "$$target-am" || exit 1; \
fi; test -z "$$fail"
mostlyclean-recursive clean-recursive distclean-recursive \
maintainer-clean-recursive:
@set fnord $$MAKEFLAGS; amf=$$2; \
dot_seen=no; \
case "$@" in \
distclean-* | maintainer-clean-*) list='$(DIST_SUBDIRS)' ;; \
*) list='$(SUBDIRS)' ;; \
esac; \
rev=''; for subdir in $$list; do \
if test "$$subdir" = "."; then :; else \
rev="$$subdir $$rev"; \
fi; \
done; \
rev="$$rev ."; \
target=`echo $@ | sed s/-recursive//`; \
for subdir in $$rev; do \
echo "Making $$target in $$subdir"; \
if test "$$subdir" = "."; then \
local_target="$$target-am"; \
else \
local_target="$$target"; \
fi; \
(cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) $$local_target) \
|| case "$$amf" in *=*) exit 1;; *k*) fail=yes;; *) exit 1;; esac; \
done && test -z "$$fail"
tags-recursive:
list='$(SUBDIRS)'; for subdir in $$list; do \
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) tags); \
done
ctags-recursive:
list='$(SUBDIRS)'; for subdir in $$list; do \
test "$$subdir" = . || (cd $$subdir && $(MAKE) $(AM_MAKEFLAGS) ctags); \
done
ID: $(HEADERS) $(SOURCES) $(LISP) $(TAGS_FILES)
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
mkid -fID $$unique
tags: TAGS
TAGS: tags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
$(TAGS_FILES) $(LISP)
tags=; \
here=`pwd`; \
if ($(ETAGS) --etags-include --version) >/dev/null 2>&1; then \
include_option=--etags-include; \
empty_fix=.; \
else \
include_option=--include; \
empty_fix=; \
fi; \
list='$(SUBDIRS)'; for subdir in $$list; do \
if test "$$subdir" = .; then :; else \
test ! -f $$subdir/TAGS || \
tags="$$tags $$include_option=$$here/$$subdir/TAGS"; \
fi; \
done; \
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
if test -z "$(ETAGS_ARGS)$$tags$$unique"; then :; else \
test -n "$$unique" || unique=$$empty_fix; \
$(ETAGS) $(ETAGSFLAGS) $(AM_ETAGSFLAGS) $(ETAGS_ARGS) \
$$tags $$unique; \
fi
ctags: CTAGS
CTAGS: ctags-recursive $(HEADERS) $(SOURCES) $(TAGS_DEPENDENCIES) \
$(TAGS_FILES) $(LISP)
tags=; \
here=`pwd`; \
list='$(SOURCES) $(HEADERS) $(LISP) $(TAGS_FILES)'; \
unique=`for i in $$list; do \
if test -f "$$i"; then echo $$i; else echo $(srcdir)/$$i; fi; \
done | \
$(AWK) ' { files[$$0] = 1; } \
END { for (i in files) print i; }'`; \
test -z "$(CTAGS_ARGS)$$tags$$unique" \
|| $(CTAGS) $(CTAGSFLAGS) $(AM_CTAGSFLAGS) $(CTAGS_ARGS) \
$$tags $$unique
GTAGS:
here=`$(am__cd) $(top_builddir) && pwd` \
&& cd $(top_srcdir) \
&& gtags -i $(GTAGS_ARGS) $$here
distclean-tags:
-rm -f TAGS ID GTAGS GRTAGS GSYMS GPATH tags
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
list='$(DIST_SUBDIRS)'; for subdir in $$list; do \
if test "$$subdir" = .; then :; else \
test -d "$(distdir)/$$subdir" \
|| $(mkdir_p) "$(distdir)/$$subdir" \
|| exit 1; \
distdir=`$(am__cd) $(distdir) && pwd`; \
top_distdir=`$(am__cd) $(top_distdir) && pwd`; \
(cd $$subdir && \
$(MAKE) $(AM_MAKEFLAGS) \
top_distdir="$$top_distdir" \
distdir="$$distdir/$$subdir" \
distdir) \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-recursive
all-am: Makefile
installdirs: installdirs-recursive
installdirs-am:
install: install-recursive
install-exec: install-exec-recursive
install-data: install-data-recursive
uninstall: uninstall-recursive
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-recursive
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-recursive
clean-am: clean-generic mostlyclean-am
distclean: distclean-recursive
-rm -f Makefile
distclean-am: clean-am distclean-generic distclean-tags
dvi: dvi-recursive
dvi-am:
html: html-recursive
info: info-recursive
info-am:
install-data-am:
install-exec-am:
install-info: install-info-recursive
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-recursive
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-recursive
mostlyclean-am: mostlyclean-generic
pdf: pdf-recursive
pdf-am:
ps: ps-recursive
ps-am:
uninstall-am: uninstall-info-am
uninstall-info: uninstall-info-recursive
.PHONY: $(RECURSIVE_TARGETS) CTAGS GTAGS all all-am check check-am \
clean clean-generic clean-recursive ctags ctags-recursive \
distclean distclean-generic distclean-recursive distclean-tags \
distdir dvi dvi-am html html-am info info-am install \
install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-strip installcheck installcheck-am installdirs \
installdirs-am maintainer-clean maintainer-clean-generic \
maintainer-clean-recursive mostlyclean mostlyclean-generic \
mostlyclean-recursive pdf pdf-am ps ps-am tags tags-recursive \
uninstall uninstall-am uninstall-info-am
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1,313 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# doc/html/Makefile. Generated from Makefile.in by configure.
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
srcdir = .
top_srcdir = ../..
pkgdatadir = $(datadir)/WordNet
pkglibdir = $(libdir)/WordNet
pkgincludedir = $(includedir)/WordNet
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = /usr/csl/bin/install -c
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/html
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
$(srcdir)/*) f=`echo "$$p" | sed "s|^$$srcdirstrip/||"`;; \
*) f=$$p;; \
esac;
am__strip_dir = `echo $$p | sed -e 's|^.*/||'`;
am__installdirs = "$(DESTDIR)$(htmldir)"
htmlDATA_INSTALL = $(INSTALL_DATA)
DATA = $(html_DATA)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run aclocal-1.9
AMDEP_FALSE = #
AMDEP_TRUE =
AMTAR = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run tar
AUTOCONF = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoconf
AUTOHEADER = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoheader
AUTOMAKE = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run automake-1.9
AWK = nawk
CC = gcc
CCDEPMODE = depmode=gcc3
CFLAGS = -g -O2
CPP = gcc -E
CPPFLAGS =
CYGPATH_W = echo
DEFS = -DHAVE_CONFIG_H
DEPDIR = .deps
ECHO_C =
ECHO_N = -n
ECHO_T =
EGREP = egrep
EXEEXT =
INSTALL_DATA = ${INSTALL} -m 644
INSTALL_PROGRAM = ${INSTALL}
INSTALL_SCRIPT = ${INSTALL}
INSTALL_STRIP_PROGRAM = ${SHELL} $(install_sh) -c -s
LDFLAGS =
LIBOBJS =
LIBS =
LTLIBOBJS =
MAKEINFO = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run makeinfo
OBJEXT = o
PACKAGE = WordNet
PACKAGE_BUGREPORT = wordnet@princeton.edu
PACKAGE_NAME = WordNet
PACKAGE_STRING = WordNet 3.0
PACKAGE_TARNAME = wordnet
PACKAGE_VERSION = 3.0
PATH_SEPARATOR = :
RANLIB = ranlib
SET_MAKE =
SHELL = /bin/bash
STRIP =
TCL_INCLUDE_SPEC = -I/usr/csl/include
TCL_LIB_SPEC = -L/usr/csl/lib -ltcl8.4
TK_LIBS = -L/usr/openwin/lib -lX11 -ldl -lpthread -lsocket -lnsl -lm
TK_LIB_SPEC = -L/usr/csl/lib -ltk8.4
TK_PREFIX = /usr/csl
TK_XINCLUDES = -I/usr/openwin/include
VERSION = 3.0
ac_ct_CC = gcc
ac_ct_RANLIB = ranlib
ac_ct_STRIP =
ac_prefix = /usr/local/WordNet-3.0
am__fastdepCC_FALSE = #
am__fastdepCC_TRUE =
am__include = include
am__leading_dot = .
am__quote =
am__tar = ${AMTAR} chof - "$$tardir"
am__untar = ${AMTAR} xf -
bindir = ${exec_prefix}/bin
build_alias =
datadir = ${prefix}/share
exec_prefix = ${prefix}
host_alias =
includedir = ${prefix}/include
infodir = ${prefix}/info
install_sh = /people/wn/src/Release/3.0/Unix/install-sh
libdir = ${exec_prefix}/lib
libexecdir = ${exec_prefix}/libexec
localstatedir = ${prefix}/var
mandir = ${prefix}/man
mkdir_p = $(install_sh) -d
oldincludedir = /usr/include
prefix = /usr/local/WordNet-3.0
program_transform_name = s,x,x,
sbindir = ${exec_prefix}/sbin
sharedstatedir = ${prefix}/com
sysconfdir = ${prefix}/etc
target_alias =
htmldir = $(prefix)/doc/html
html_DATA = binsrch.3WN.html cntlist.5WN.html grind.1WN.html lexnames.5WN.html morph.3WN.html morphy.7WN.html senseidx.5WN.html uniqbeg.7WN.html wn.1WN.html wnb.1WN.html wndb.5WN.html wngloss.7WN.html wngroups.7WN.html wninput.5WN.html wnintro.1WN.html wnintro.3WN.html wnintro.5WN.html wnintro.7WN.html wnlicens.7WN.html wnpkgs.7WN.html wnsearch.3WN.html wnstats.7WN.html wnutil.3WN.html
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/html/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/html/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-htmlDATA: $(html_DATA)
@$(NORMAL_INSTALL)
test -z "$(htmldir)" || $(mkdir_p) "$(DESTDIR)$(htmldir)"
@list='$(html_DATA)'; for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
f=$(am__strip_dir) \
echo " $(htmlDATA_INSTALL) '$$d$$p' '$(DESTDIR)$(htmldir)/$$f'"; \
$(htmlDATA_INSTALL) "$$d$$p" "$(DESTDIR)$(htmldir)/$$f"; \
done
uninstall-htmlDATA:
@$(NORMAL_UNINSTALL)
@list='$(html_DATA)'; for p in $$list; do \
f=$(am__strip_dir) \
echo " rm -f '$(DESTDIR)$(htmldir)/$$f'"; \
rm -f "$(DESTDIR)$(htmldir)/$$f"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(DATA)
installdirs:
for dir in "$(DESTDIR)$(htmldir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-htmlDATA
install-exec-am:
install-info: install-info-am
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-htmlDATA uninstall-info-am
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-htmlDATA install-info install-info-am \
install-man install-strip installcheck installcheck-am \
installdirs maintainer-clean maintainer-clean-generic \
mostlyclean mostlyclean-generic pdf pdf-am ps ps-am uninstall \
uninstall-am uninstall-htmlDATA uninstall-info-am
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1,2 +0,0 @@
htmldir = $(prefix)/doc/html
html_DATA = binsrch.3WN.html cntlist.5WN.html grind.1WN.html lexnames.5WN.html morph.3WN.html morphy.7WN.html senseidx.5WN.html uniqbeg.7WN.html wn.1WN.html wnb.1WN.html wndb.5WN.html wngloss.7WN.html wngroups.7WN.html wninput.5WN.html wnintro.1WN.html wnintro.3WN.html wnintro.5WN.html wnintro.7WN.html wnlicens.7WN.html wnpkgs.7WN.html wnsearch.3WN.html wnstats.7WN.html wnutil.3WN.html

@ -1,313 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# @configure_input@
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
@SET_MAKE@
srcdir = @srcdir@
top_srcdir = @top_srcdir@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = @INSTALL@
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/html
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
$(srcdir)/*) f=`echo "$$p" | sed "s|^$$srcdirstrip/||"`;; \
*) f=$$p;; \
esac;
am__strip_dir = `echo $$p | sed -e 's|^.*/||'`;
am__installdirs = "$(DESTDIR)$(htmldir)"
htmlDATA_INSTALL = $(INSTALL_DATA)
DATA = $(html_DATA)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
AMDEP_FALSE = @AMDEP_FALSE@
AMDEP_TRUE = @AMDEP_TRUE@
AMTAR = @AMTAR@
AUTOCONF = @AUTOCONF@
AUTOHEADER = @AUTOHEADER@
AUTOMAKE = @AUTOMAKE@
AWK = @AWK@
CC = @CC@
CCDEPMODE = @CCDEPMODE@
CFLAGS = @CFLAGS@
CPP = @CPP@
CPPFLAGS = @CPPFLAGS@
CYGPATH_W = @CYGPATH_W@
DEFS = @DEFS@
DEPDIR = @DEPDIR@
ECHO_C = @ECHO_C@
ECHO_N = @ECHO_N@
ECHO_T = @ECHO_T@
EGREP = @EGREP@
EXEEXT = @EXEEXT@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
LDFLAGS = @LDFLAGS@
LIBOBJS = @LIBOBJS@
LIBS = @LIBS@
LTLIBOBJS = @LTLIBOBJS@
MAKEINFO = @MAKEINFO@
OBJEXT = @OBJEXT@
PACKAGE = @PACKAGE@
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
PACKAGE_NAME = @PACKAGE_NAME@
PACKAGE_STRING = @PACKAGE_STRING@
PACKAGE_TARNAME = @PACKAGE_TARNAME@
PACKAGE_VERSION = @PACKAGE_VERSION@
PATH_SEPARATOR = @PATH_SEPARATOR@
RANLIB = @RANLIB@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
STRIP = @STRIP@
TCL_INCLUDE_SPEC = @TCL_INCLUDE_SPEC@
TCL_LIB_SPEC = @TCL_LIB_SPEC@
TK_LIBS = @TK_LIBS@
TK_LIB_SPEC = @TK_LIB_SPEC@
TK_PREFIX = @TK_PREFIX@
TK_XINCLUDES = @TK_XINCLUDES@
VERSION = @VERSION@
ac_ct_CC = @ac_ct_CC@
ac_ct_RANLIB = @ac_ct_RANLIB@
ac_ct_STRIP = @ac_ct_STRIP@
ac_prefix = @ac_prefix@
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
am__include = @am__include@
am__leading_dot = @am__leading_dot@
am__quote = @am__quote@
am__tar = @am__tar@
am__untar = @am__untar@
bindir = @bindir@
build_alias = @build_alias@
datadir = @datadir@
exec_prefix = @exec_prefix@
host_alias = @host_alias@
includedir = @includedir@
infodir = @infodir@
install_sh = @install_sh@
libdir = @libdir@
libexecdir = @libexecdir@
localstatedir = @localstatedir@
mandir = @mandir@
mkdir_p = @mkdir_p@
oldincludedir = @oldincludedir@
prefix = @prefix@
program_transform_name = @program_transform_name@
sbindir = @sbindir@
sharedstatedir = @sharedstatedir@
sysconfdir = @sysconfdir@
target_alias = @target_alias@
htmldir = $(prefix)/doc/html
html_DATA = binsrch.3WN.html cntlist.5WN.html grind.1WN.html lexnames.5WN.html morph.3WN.html morphy.7WN.html senseidx.5WN.html uniqbeg.7WN.html wn.1WN.html wnb.1WN.html wndb.5WN.html wngloss.7WN.html wngroups.7WN.html wninput.5WN.html wnintro.1WN.html wnintro.3WN.html wnintro.5WN.html wnintro.7WN.html wnlicens.7WN.html wnpkgs.7WN.html wnsearch.3WN.html wnstats.7WN.html wnutil.3WN.html
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/html/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/html/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-htmlDATA: $(html_DATA)
@$(NORMAL_INSTALL)
test -z "$(htmldir)" || $(mkdir_p) "$(DESTDIR)$(htmldir)"
@list='$(html_DATA)'; for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
f=$(am__strip_dir) \
echo " $(htmlDATA_INSTALL) '$$d$$p' '$(DESTDIR)$(htmldir)/$$f'"; \
$(htmlDATA_INSTALL) "$$d$$p" "$(DESTDIR)$(htmldir)/$$f"; \
done
uninstall-htmlDATA:
@$(NORMAL_UNINSTALL)
@list='$(html_DATA)'; for p in $$list; do \
f=$(am__strip_dir) \
echo " rm -f '$(DESTDIR)$(htmldir)/$$f'"; \
rm -f "$(DESTDIR)$(htmldir)/$$f"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(DATA)
installdirs:
for dir in "$(DESTDIR)$(htmldir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-htmlDATA
install-exec-am:
install-info: install-info-am
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-htmlDATA uninstall-info-am
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-htmlDATA install-info install-info-am \
install-man install-strip installcheck installcheck-am \
installdirs maintainer-clean maintainer-clean-generic \
mostlyclean mostlyclean-generic pdf pdf-am ps ps-am uninstall \
uninstall-am uninstall-htmlDATA uninstall-info-am
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1,78 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>BINSRCH(3WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
bin_search, copyfile, replace_line, insert_line
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>char *bin_search(char
*key, FILE *fp); </B> <P>
<B>void copyfile(FILE *fromfp, FILE *tofp); </B> <P>
<B>char *replace_line(char
*new_line, char *key, FILE *fp); </B> <P>
<B>char *insert_line(char *new_line, char
*key, FILE *fp); </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<P>
The WordNet library contains several general
purpose functions for performing a binary search and modifying sorted
files. <P>
<B>bin_search()</B> is the primary binary search algorithm to search for
<I>key </I> as the first item on a line in the file pointed to by <I>fp </I>. The delimiter
between the key and the rest of the fields on the line, if any, must be
a space. A pointer to a static variable containing the entire line is
returned. <FONT SIZE=-1><B>NULL </B></FONT>
is returned if a match is not found. <P>
The remaining functions
are not used by WordNet, and are only briefly described. <P>
<B>copyfile()</B> copies
the contents of one file to another. <P>
<B>replace_line()</B> replaces a line in
a file having searchkey <I>key </I> with the contents of <I>new_line </I>. It returns
the original line or <FONT SIZE=-1><B>NULL </B></FONT>
in case of error. <P>
<B>insert_line()</B> finds the proper
place to insert the contents of <I>new_line </I>, having searchkey <I>key </I> in the
sorted file pointed to by <I>fp </I>. It returns <FONT SIZE=-1><B>NULL </B></FONT>
if a line with this searchkey
is already in the file.
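The lookup described above can be sketched in Python. This is a minimal illustration, not the WordNet C implementation: it assumes a file opened in binary mode, sorted in byte order on the first space-delimited field, and it returns a fresh bytes object rather than a pointer to a static buffer.

```python
import io


def bin_search(key, fp):
    """Return the line whose first space-delimited field equals key, or None.

    fp must be a binary file object whose lines are sorted by that first field.
    """
    if isinstance(key, str):
        key = key.encode("utf-8")
    fp.seek(0, io.SEEK_END)
    lo, hi = 0, fp.tell()
    # Invariant: if a matching line exists, it starts at an offset in [lo, hi).
    while lo < hi:
        mid = (lo + hi) // 2
        if mid == 0:
            fp.seek(0)
        else:
            # Consume the rest of the line that holds byte mid - 1, so the
            # file position lands on the first line start at or after mid.
            fp.seek(mid - 1)
            fp.readline()
        start = fp.tell()
        if start >= hi:
            hi = mid  # no line starts in [mid, hi)
            continue
        line = fp.readline()
        line_key = line.split(b" ", 1)[0].rstrip(b"\n")
        if line_key == key:
            return line
        if line_key < key:
            lo = start + len(line)  # a match can only start after this line
        else:
            hi = mid  # a match can only start before mid
    return None
```

Because each call returns a new object, the copy-before-reuse caveat that applies to the C function does not apply to this sketch.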
<H2><A NAME="sect3" HREF="#toc3">NOTES </A></H2>
The maximum length of <I>key </I> is 1024. <P>
The
maximum line length in a file is 25K. <P>
If there are no additional fields
after the search key, the key must be followed by at least one space before
the newline character.
<H2><A NAME="sect4" HREF="#toc4">SEE ALSO </A></H2>
<A HREF="wnintro.3WN.html"><B>wnintro</B>(3WN)</A>,
<A HREF="morph.3WN.html"><B>morph</B>(3WN)</A>,
<A HREF="wnsearch.3WN.html"><B>wnsearch</B>(3WN)</A>,
<A HREF="wnutil.3WN.html"><B>wnutil</B>(3WN)</A>,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>.
<H2><A NAME="sect5" HREF="#toc5">WARNINGS </A></H2>
<B>bin_search() </B> returns a pointer to
a static character buffer. The returned string should be copied by the
caller if the results need to be saved, as a subsequent call will replace
the contents of the static buffer. <P>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">NOTES</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEE ALSO</A></LI>
<LI><A NAME="toc5" HREF="#sect5">WARNINGS</A></LI>
</UL>
</BODY></HTML>

@ -1,125 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>CNTLIST(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
cntlist - file listing number of times each tagged sense occurs
in a semantic concordance, sorted most to least frequently tagged <P>
cntlist.rev
- file listing number of times each tagged sense occurs in a semantic concordance,
sorted by sense key
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
A cntlist file for a semantic concordance
lists the number of times each semantically tagged sense occurs in the
concordance and its sense number in the WordNet database. Each line in
the file corresponds to a sense in the WordNet database to which at least
one semantic tag points. Only senses that are tagged in a concordance
are in the concordance's cntlist file. <P>
<H3><A NAME="sect2" HREF="#toc2">WordNet Database <I>cntlist </I> File
</A></H3>
In the WordNet database, words are assigned sense numbers based on frequency
of use in semantically tagged corpora. The cntlist file used by <A HREF="grind.1WN.html"><B>grind</B>(1WN)</A>
to build the WordNet database and assign the sense numbers is a union
of the cntlist files from the various semantic concordances that were
formerly released by Princeton University. This combined cntlist file
is provided with the WordNet package and is found in the <B>WNSEARCHDIR </B>
directory. <P>
The <I>cntlist.rev </I> file is used at run-time by the WordNet library
code and browser interfaces to print in the output display the number
of times each sense has been tagged.
<H3><A NAME="sect3" HREF="#toc3">File Format </A></H3>
Each line in a cntlist
file contains information for one sense. The file is ordered from most
to least frequently tagged sense. The fields are separated by one space,
and each line is terminated with a newline character. Senses having the
same <I>tag_cnt </I> value are listed in reverse alphabetical order of the <I>lemma
</I> field of the <I>sense_key </I>. <P>
Each line in <B>cntlist </B> is of the form: <P>
<blockquote><I>tag_cnt&nbsp;&nbsp;sense_key&nbsp;&nbsp;sense_number
</I> </blockquote>
<P>
where <I>tag_cnt </I> is the decimal number of times the sense is tagged in
the corresponding semantic concordance. <I>sense_key </I> is a WordNet sense
encoding and <I>sense_number </I> is a WordNet sense number as described in <P>
The <I>cntlist.rev </I> file contains the same fields described above, in the
following order: <P>
<blockquote><I>sense_key&nbsp;&nbsp;sense_number&nbsp;&nbsp;tag_cnt </I> </blockquote>
<P>
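The two field orders above can be illustrated with a short Python sketch. The names `CntlistEntry`, `parse_cntlist_line` and `to_rev_line` are hypothetical, and the sample lines are made up for illustration rather than taken from a real cntlist file.

```python
from typing import NamedTuple


class CntlistEntry(NamedTuple):
    tag_cnt: int
    sense_key: str
    sense_number: int


def parse_cntlist_line(line):
    """Parse one 'tag_cnt sense_key sense_number' line from a cntlist file."""
    tag_cnt, sense_key, sense_number = line.split()
    return CntlistEntry(int(tag_cnt), sense_key, int(sense_number))


def to_rev_line(entry):
    """Format an entry in cntlist.rev order: sense_key sense_number tag_cnt."""
    return f"{entry.sense_key} {entry.sense_number} {entry.tag_cnt}"


# Made-up sample lines, ordered from most to least frequently tagged.
sample = ["12 dog%1:05:00:: 1", "3 cat%1:05:00:: 1"]
entries = [parse_cntlist_line(s) for s in sample]

# cntlist.rev holds the same records, sorted by sense key instead.
rev_lines = [to_rev_line(e) for e in sorted(entries, key=lambda e: e.sense_key)]
```

Splitting on whitespace is enough here because the fields themselves contain no spaces.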
<H2><A NAME="sect4" HREF="#toc4">NOTES </A></H2>
Princeton
no longer maintains or releases the Semantic Concordance files. The <I>cntlist
</I> file used to order the senses in WordNet 3.0 was generated from the Semantic
Concordance files at the point that they were last updated in 2001. In
general, the order of senses presented reflects what the user
would expect; however, sense ordering is now less reliable than in prior
releases and should not be construed as an accurate indicator of frequency
of use.
<H2><A NAME="sect5" HREF="#toc5">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet.
Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the
WordNet database has been installed. Default is <B>WNHOME/dict </B>. </DD>
</DL>
<H2><A NAME="sect6" HREF="#toc6">REGISTRY
(WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory for
WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
<DT><B>HKEY_CURRENT_USER\SOFTWARE\WordNet\3.0\wnres</B>
</DT>
<DD>User's default browser options. </DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">FILES </A></H2>
<DL>
<DT><B>cntlist, cntlist.rev</B> </DT>
<DD>file of combined
semantic concordance <B>cntlist </B> files. Used to assign sense numbers in WordNet
database </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">SEE ALSO </A></H2>
<A HREF="grind.1WN.html"><B>grind</B>(1WN)</A>,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>,
<A HREF="senseidx.5WN.html"><B>senseidx</B>(5WN)</A>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">WordNet Database cntlist File</A></LI>
<LI><A NAME="toc3" HREF="#sect3">File Format</A></LI>
</UL>
<LI><A NAME="toc4" HREF="#sect4">NOTES</A></LI>
<LI><A NAME="toc5" HREF="#sect5">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc6" HREF="#sect6">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc7" HREF="#sect7">FILES</A></LI>
<LI><A NAME="toc8" HREF="#sect8">SEE ALSO</A></LI>
</UL>
</BODY></HTML>


@ -1,195 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>GRIND(1) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
grind - process WordNet lexicographer files
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<B>grind </B> [ <B>-v
</B> ] [ <B>-s </B> ] [ <B>-L </B><I>logfile </I> ] [ <B>-a </B> ] [ <B>-d </B> ] [ <B>-i </B> ] [ <B>-o </B> ] [ <B>-n </B> ] <I>filename </I>
[ <I>filename </I>... ]
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<B>grind() </B> processes WordNet lexicographer files,
producing database files suitable for use with the WordNet search and
interface code and other applications. The syntactic and structural integrity
of the input files is verified. Warnings and errors are reported via <B>stderr
</B> and a run-time log is produced on <B>stdout </B>. A database is generated only
if there are no errors.
<H3><A NAME="sect3" HREF="#toc3">Input Files </A></H3>
Input files correspond to the syntactic
categories implemented in WordNet - <B>noun</B>, <B>verb</B>, <B>adjective</B> and <B>adverb</B>.
Each input lexicographer file consists of a list of synonym sets (<I>synsets
</I>) for one part of speech. Although the basic synset syntax is the same
for all of the parts of speech, some parts of the syntax only apply to
a particular part of speech. See <A HREF="wninput.5WN.html"><B>wninput</B>(5WN)</A>
for a description of the
input file format. <P>
Each <I>filename </I> specified is of the form: <P>
<blockquote><I>pathname </I>/<I>pos </I>.<I>suffix </I></blockquote>
<P>
where
<I>pathname </I> is optional and <I>pos </I> is either <B>noun</B>, <B>verb</B>, <B>adj</B> or <B>adv</B>. <I>suffix
</I> may be used to separate groups of synsets into different files, for example
<B>noun.animal </B> and <B>noun.plant </B>. One or more input files, in any combination
of syntactic categories, may be specified. See <A HREF="lexnames.5WN.html"><B>lexnames</B>(5WN)</A>
for a list
of the lexicographer files used to build the complete WordNet database.
<H3><A NAME="sect4" HREF="#toc4">Output Files </A></H3>
<B>grind() </B> produces the following output files: <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>Filename
</B></TD> <TD ALIGN=CENTER>Description </TD> </TR>
<TR> <TD ALIGN=LEFT><B>index.<I>pos </I></B> </TD> <TD ALIGN=LEFT>Index file for each syntactic category </TD>
</TR>
<TR> <TD ALIGN=LEFT><B>data.<I>pos </I></B> </TD> <TD ALIGN=LEFT>Data file for each syntactic category </TD> </TR>
<TR> <TD ALIGN=LEFT><B>index.sense </B> </TD> <TD ALIGN=LEFT>Sense
index </TD> </TR>
</TABLE>
<P>
See <A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>
for a description of the database file formats.
<P>
Each time <B>grind() </B> is run, any existing database files are overwritten
with the database files generated from the specified input files. If no
input files from a syntactic category are specified, the corresponding
database files are not overwritten.
<H3><A NAME="sect5" HREF="#toc5">Sense Numbers </A></H3>
Senses are generally
ordered from most to least frequently used, with the most common sense
numbered <B>1 </B>. Frequency of use is determined by the number of times a sense
is tagged in the various semantic concordance texts. Senses that are not
semantically tagged follow the ordered senses in an arbitrary order.
Note that this ordering is only an estimate based on usage in a small
corpus. <P>
The <I>tagsense_cnt </I> field for each entry in the <B>index.<I>pos </I></B> files
indicates how many of the senses in the list have been tagged. <P>
The <B>cntlist
</B> file provided with the database lists the number of times each sense
is tagged in the semantic concordances. <B>grind() </B> uses the data from <B>cntlist
</B> to order the senses of each word. When the <B>index </B>.<I>pos </I> files are generated,
the <I>synset_offset </I>s are output in sense number order, with sense 1 first
in the list. Senses with the same number of semantic tags are assigned
unique but consecutive sense numbers. The WordNet <FONT SIZE=-1><B>OVERVIEW </B></FONT>
search displays
all senses of the specified word, in all syntactic categories, and indicates
which of the senses are represented in the semantically tagged texts.
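<P>
The ordering described above can be sketched in a few lines of Python. This is an illustration only, not the code that <B>grind </B> uses: the helper names and the (synset_offset, tag_count) pair layout are hypothetical, and ties are simply left in input order, since the text says only that such senses receive unique, consecutive numbers.

```python
def order_senses(senses):
    """Assign sense numbers from tag counts.

    `senses` is a list of (synset_offset, tag_count) pairs for one word.
    Both the pair layout and this helper are assumptions for illustration.
    """
    # sorted() is stable, so senses with equal counts keep their input order.
    ranked = sorted(senses, key=lambda s: s[1], reverse=True)
    # Sense numbers start at 1 for the most frequently tagged sense.
    return [(number, offset) for number, (offset, _) in enumerate(ranked, start=1)]


def tagsense_cnt(senses):
    """Count how many senses were tagged at least once."""
    return sum(1 for _, count in senses if count > 0)
```

Untagged senses (count 0) naturally fall to the end of the list, which matches the description of untagged senses following the ordered ones.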
<H2><A NAME="sect6" HREF="#toc6">OPTIONS </A></H2>
<DL>
<DT><B>-v</B> </DT>
<DD>Verify integrity of input without generating database. </DD>
<DT><B>-s</B> </DT>
<DD>Suppress
generation of warning messages. Usually <B>grind </B> is run with this option
until all syntactic and structural errors are corrected since the warning
messages may make it difficult to spot error messages. </DD>
<DT><B>-L</B><I>logfile</I> </DT>
<DD>Write
all messages to <I>logfile </I> instead of <B>stderr </B>. </DD>
<DT><B>-a</B> </DT>
<DD>Generate statistical report
on input files processed. </DD>
<DT><B>-d</B> </DT>
<DD>Generate distribution of senses by string
length report on input files processed. </DD>
<DT><B>-i</B> </DT>
<DD>Generate sense index file. </DD>
<DT><B>-o</B>
</DT>
<DD>Order senses using <B>cntlist </B>. </DD>
<DT><B>-n</B> </DT>
<DD>Generate nominalization (derivational
morphology) links in database. </DD>
<DT><I>filename</I> </DT>
<DD>Input file of the form described
in <FONT SIZE=-1><B>Input Files </B></FONT>.
</DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">FILES </A></H2>
<DL>
<DT><B><I>pos </I>.*</B> </DT>
<DD>lexicographer files to use to build database
</DD>
<DT><B>cntlist</B> </DT>
<DD>file of combined semantic concordance <B>cntlist </B> files. Used to
assign sense numbers in WordNet database </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">SEE ALSO </A></H2>
<A HREF="cntlist.5WN.html"><B>cntlist</B>(5WN)</A>
, <A HREF="lexnames.5WN.html"><B>lexnames</B>(5WN)</A>
,
<A HREF="senseidx.5WN.html"><B>senseidx</B>(5WN)</A>
, <A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>
, <A HREF="wninput.5WN.html"><B>wninput</B>(5WN)</A>
, <A HREF="uniqbeg.7WN.html"><B>uniqbeg</B>(7WN)</A>
, <A HREF="wngloss.7WN.html"><B>wngloss</B>(7WN)</A>
.
<H2><A NAME="sect9" HREF="#toc9">DIAGNOSTICS
</A></H2>
Exit status is normally 0. Exit status is -1 if a non-specific error occurs.
If syntactic or structural errors exist, the exit status is the number of errors
detected.
<DL>
<DT><B>usage: grind [-v] [-s] [-Llogfile] [-a ] [-d] [-i] [-o] [-n] filename
[filename...]</B> </DT>
<DD>Invalid options were specified on the command line. </DD>
<DT><B>No input
files processed.</B> </DT>
<DD>None of the filenames specified were of the appropriate
form. </DD>
<DT><B><I>n </I> syntactic errors found.</B> </DT>
<DD>Syntax errors were found while parsing
the input files. </DD>
<DT><B><I>n </I> structural errors found.</B> </DT>
<DD>Pointer errors were found
that could not be automatically corrected. </DD>
</DL>
<H2><A NAME="sect10" HREF="#toc10">BUGS </A></H2>
Please report bugs to
<B>wordnet@princeton.edu </B>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc3" HREF="#sect3">Input Files</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Output Files</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Sense Numbers</A></LI>
</UL>
<LI><A NAME="toc6" HREF="#sect6">OPTIONS</A></LI>
<LI><A NAME="toc7" HREF="#sect7">FILES</A></LI>
<LI><A NAME="toc8" HREF="#sect8">SEE ALSO</A></LI>
<LI><A NAME="toc9" HREF="#sect9">DIAGNOSTICS</A></LI>
<LI><A NAME="toc10" HREF="#sect10">BUGS</A></LI>
</UL>
</BODY></HTML>


@ -1,195 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>LEXNAMES(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
List of WordNet lexicographer file names and numbers
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION
</A></H2>
During WordNet development synsets are organized into forty-five lexicographer
files based on syntactic category and logical groupings. <A HREF="grind.1WN.html"><B>grind</B>(1WN)</A>
processes
these files and produces a database suitable for use with the WordNet
library, interface code, and other applications. The format of the lexicographer
files is described in <A HREF="wninput.5WN.html"><B>wninput</B>(5WN)</A>
. <P>
A file number corresponds to each
lexicographer file. File numbers are encoded in several parts of the WordNet
system as an efficient way to indicate a lexicographer file name. The
file <B>lexnames </B> lists the mapping between file names and numbers, and can
be used by programs or end users to correlate the two.
<H3><A NAME="sect2" HREF="#toc2">File Format </A></H3>
Each
line in <B>lexnames </B> contains three tab-separated fields, and is terminated with
a newline character. The first field is the two-digit decimal integer
file number. (The first file in the list is numbered <B>00 </B>.) The second
field is the name of the lexicographer file that is represented by that
number, and the third field is an integer that indicates the syntactic
category of the synsets contained in the file. This is simply a shortcut
for programs and scripts, since the syntactic category is also part of
the lexicographer file's name.
<H3><A NAME="sect3" HREF="#toc3">Syntactic Category </A></H3>
The syntactic category
field is encoded as follows: <P>
<blockquote><B>1 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>2 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>3 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>4 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
</blockquote>
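<P>
As a rough illustration of the file format and category codes above, the following Python sketch parses one line of <B>lexnames </B>. The names LexFile, POS_NAMES and parse_lexnames_line are invented for this example and are not part of the WordNet library.

```python
from typing import NamedTuple

# Syntactic category codes as listed above.
POS_NAMES = {1: "NOUN", 2: "VERB", 3: "ADJECTIVE", 4: "ADVERB"}


class LexFile(NamedTuple):
    number: int  # two-digit file number, e.g. 05
    name: str    # lexicographer file name, e.g. "noun.animal"
    pos: int     # syntactic category code (1-4)


def parse_lexnames_line(line):
    """Split one tab-separated lexnames line into its three fields."""
    number, name, pos = line.rstrip("\n").split("\t")
    return LexFile(int(number), name, int(pos))
```

A caller could then write POS_NAMES[entry.pos] to turn the numeric category of a parsed entry back into a readable name.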
<H3><A NAME="sect4" HREF="#toc4">Lexicographer Files </A></H3>
The names of the lexicographer files and their corresponding
file numbers are listed below along with a brief description of each file's
contents. <P>
<blockquote> <TABLE BORDER=0>
<TR> <TD ALIGN=LEFT><B>File Number </B> </TD> <TD ALIGN=LEFT><B>Name </B> </TD> <TD ALIGN=LEFT><B>Contents </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>00 </TD> <TD ALIGN=LEFT>adj.all </TD> <TD ALIGN=LEFT>all adjective
clusters </TD> </TR>
<TR> <TD ALIGN=LEFT>01 </TD> <TD ALIGN=LEFT>adj.pert </TD> <TD ALIGN=LEFT>relational adjectives (pertainyms) </TD> </TR>
<TR> <TD ALIGN=LEFT>02 </TD> <TD ALIGN=LEFT>adv.all
</TD> <TD ALIGN=LEFT>all adverbs </TD> </TR>
<TR> <TD ALIGN=LEFT>03 </TD> <TD ALIGN=LEFT>noun.Tops </TD> <TD ALIGN=LEFT>unique beginner for nouns </TD> </TR>
<TR> <TD ALIGN=LEFT>04 </TD> <TD ALIGN=LEFT>noun.act
</TD> <TD ALIGN=LEFT>nouns denoting acts or actions </TD> </TR>
<TR> <TD ALIGN=LEFT>05 </TD> <TD ALIGN=LEFT>noun.animal </TD> <TD ALIGN=LEFT>nouns denoting animals
</TD> </TR>
<TR> <TD ALIGN=LEFT>06 </TD> <TD ALIGN=LEFT>noun.artifact </TD> <TD ALIGN=LEFT>nouns denoting man-made objects </TD> </TR>
<TR> <TD ALIGN=LEFT>07 </TD> <TD ALIGN=LEFT>noun.attribute
</TD> <TD ALIGN=LEFT>nouns denoting attributes of people and objects </TD> </TR>
<TR> <TD ALIGN=LEFT>08 </TD> <TD ALIGN=LEFT>noun.body </TD> <TD ALIGN=LEFT>nouns
denoting body parts </TD> </TR>
<TR> <TD ALIGN=LEFT>09 </TD> <TD ALIGN=LEFT>noun.cognition </TD> <TD ALIGN=LEFT>nouns denoting cognitive processes
and contents </TD> </TR>
<TR> <TD ALIGN=LEFT>10 </TD> <TD ALIGN=LEFT>noun.communication </TD> <TD ALIGN=LEFT>nouns denoting communicative processes
and contents </TD> </TR>
<TR> <TD ALIGN=LEFT>11 </TD> <TD ALIGN=LEFT>noun.event </TD> <TD ALIGN=LEFT>nouns denoting natural events </TD> </TR>
<TR> <TD ALIGN=LEFT>12
</TD> <TD ALIGN=LEFT>noun.feeling </TD> <TD ALIGN=LEFT>nouns denoting feelings and emotions </TD> </TR>
<TR> <TD ALIGN=LEFT>13 </TD> <TD ALIGN=LEFT>noun.food </TD>
<TD ALIGN=LEFT>nouns denoting foods and drinks </TD> </TR>
<TR> <TD ALIGN=LEFT>14 </TD> <TD ALIGN=LEFT>noun.group </TD> <TD ALIGN=LEFT>nouns denoting groupings
of people or objects </TD> </TR>
<TR> <TD ALIGN=LEFT>15 </TD> <TD ALIGN=LEFT>noun.location </TD> <TD ALIGN=LEFT>nouns denoting spatial position
</TD> </TR>
<TR> <TD ALIGN=LEFT>16 </TD> <TD ALIGN=LEFT>noun.motive </TD> <TD ALIGN=LEFT>nouns denoting goals </TD> </TR>
<TR> <TD ALIGN=LEFT>17 </TD> <TD ALIGN=LEFT>noun.object </TD> <TD ALIGN=LEFT>nouns denoting
natural objects (not man-made) </TD> </TR>
<TR> <TD ALIGN=LEFT>18 </TD> <TD ALIGN=LEFT>noun.person </TD> <TD ALIGN=LEFT>nouns denoting people
</TD> </TR>
<TR> <TD ALIGN=LEFT>19 </TD> <TD ALIGN=LEFT>noun.phenomenon </TD> <TD ALIGN=LEFT>nouns denoting natural phenomena </TD> </TR>
<TR> <TD ALIGN=LEFT>20 </TD> <TD ALIGN=LEFT>noun.plant
</TD> <TD ALIGN=LEFT>nouns denoting plants </TD> </TR>
<TR> <TD ALIGN=LEFT>21 </TD> <TD ALIGN=LEFT>noun.possession </TD> <TD ALIGN=LEFT>nouns denoting possession
and transfer of possession </TD> </TR>
<TR> <TD ALIGN=LEFT>22 </TD> <TD ALIGN=LEFT>noun.process </TD> <TD ALIGN=LEFT>nouns denoting natural
processes </TD> </TR>
<TR> <TD ALIGN=LEFT>23 </TD> <TD ALIGN=LEFT>noun.quantity </TD> <TD ALIGN=LEFT>nouns denoting quantities and units of
measure </TD> </TR>
<TR> <TD ALIGN=LEFT>24 </TD> <TD ALIGN=LEFT>noun.relation </TD> <TD ALIGN=LEFT>nouns denoting relations between people
or things or ideas </TD> </TR>
<TR> <TD ALIGN=LEFT>25 </TD> <TD ALIGN=LEFT>noun.shape </TD> <TD ALIGN=LEFT>nouns denoting two and three dimensional
shapes </TD> </TR>
<TR> <TD ALIGN=LEFT>26 </TD> <TD ALIGN=LEFT>noun.state </TD> <TD ALIGN=LEFT>nouns denoting stable states of affairs </TD>
</TR>
<TR> <TD ALIGN=LEFT>27 </TD> <TD ALIGN=LEFT>noun.substance </TD> <TD ALIGN=LEFT>nouns denoting substances </TD> </TR>
<TR> <TD ALIGN=LEFT>28 </TD> <TD ALIGN=LEFT>noun.time </TD> <TD ALIGN=LEFT>nouns
denoting time and temporal relations </TD> </TR>
<TR> <TD ALIGN=LEFT>29 </TD> <TD ALIGN=LEFT>verb.body </TD> <TD ALIGN=LEFT>verbs of grooming,
dressing and bodily care </TD> </TR>
<TR> <TD ALIGN=LEFT>30 </TD> <TD ALIGN=LEFT>verb.change </TD> <TD ALIGN=LEFT>verbs of size, temperature
change, intensifying, etc. </TD> </TR>
<TR> <TD ALIGN=LEFT>31 </TD> <TD ALIGN=LEFT>verb.cognition </TD> <TD ALIGN=LEFT>verbs of thinking, judging,
analyzing, doubting </TD> </TR>
<TR> <TD ALIGN=LEFT>32 </TD> <TD ALIGN=LEFT>verb.communication </TD> <TD ALIGN=LEFT>verbs of telling, asking,
ordering, singing </TD> </TR>
<TR> <TD ALIGN=LEFT>33 </TD> <TD ALIGN=LEFT>verb.competition </TD> <TD ALIGN=LEFT>verbs of fighting, athletic
activities </TD> </TR>
<TR> <TD ALIGN=LEFT>34 </TD> <TD ALIGN=LEFT>verb.consumption </TD> <TD ALIGN=LEFT>verbs of eating and drinking </TD> </TR>
<TR> <TD ALIGN=LEFT>35 </TD> <TD ALIGN=LEFT>verb.contact </TD> <TD ALIGN=LEFT>verbs of touching, hitting, tying, digging </TD> </TR>
<TR> <TD ALIGN=LEFT>36 </TD>
<TD ALIGN=LEFT>verb.creation </TD> <TD ALIGN=LEFT>verbs of sewing, baking, painting, performing </TD> </TR>
<TR> <TD ALIGN=LEFT>37 </TD> <TD ALIGN=LEFT>verb.emotion
</TD> <TD ALIGN=LEFT>verbs of feeling </TD> </TR>
<TR> <TD ALIGN=LEFT>38 </TD> <TD ALIGN=LEFT>verb.motion </TD> <TD ALIGN=LEFT>verbs of walking, flying, swimming
</TD> </TR>
<TR> <TD ALIGN=LEFT>39 </TD> <TD ALIGN=LEFT>verb.perception </TD> <TD ALIGN=LEFT>verbs of seeing, hearing, feeling </TD> </TR>
<TR> <TD ALIGN=LEFT>40 </TD> <TD ALIGN=LEFT>verb.possession
</TD> <TD ALIGN=LEFT>verbs of buying, selling, owning </TD> </TR>
<TR> <TD ALIGN=LEFT>41 </TD> <TD ALIGN=LEFT>verb.social </TD> <TD ALIGN=LEFT>verbs of political
and social activities and events </TD> </TR>
<TR> <TD ALIGN=LEFT>42 </TD> <TD ALIGN=LEFT>verb.stative </TD> <TD ALIGN=LEFT>verbs of being,
having, spatial relations </TD> </TR>
<TR> <TD ALIGN=LEFT>43 </TD> <TD ALIGN=LEFT>verb.weather </TD> <TD ALIGN=LEFT>verbs of raining, snowing,
thawing, thundering </TD> </TR>
<TR> <TD ALIGN=LEFT>44 </TD> <TD ALIGN=LEFT>adj.ppl </TD> <TD ALIGN=LEFT>participial adjectives </TD> </TR>
</TABLE>
</blockquote>
<H2><A NAME="sect5" HREF="#toc5">NOTES
</A></H2>
The lexicographer files are not included in the WordNet database package.
<H2><A NAME="sect6" HREF="#toc6">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet. Default
is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the WordNet database
has been installed. Default is <B>WNHOME/dict </B>. </DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B>
</DT>
<DD>Base directory for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">FILES
</A></H2>
<DL>
<DT><B>lexnames</B> </DT>
<DD>list of lexicographer file names and numbers </DD>
</DL>
<H2><A NAME="sect9" HREF="#toc9">SEE ALSO </A></H2>
<A HREF="grind.1WN.html"><B>grind</B>(1WN)</A>
,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>
, <A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>
, <A HREF="wninput.5WN.html"><B>wninput</B>(5WN)</A>
. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">File Format</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Syntactic Category</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Lexicographer Files</A></LI>
</UL>
<LI><A NAME="toc5" HREF="#sect5">NOTES</A></LI>
<LI><A NAME="toc6" HREF="#sect6">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc7" HREF="#sect7">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc8" HREF="#sect8">FILES</A></LI>
<LI><A NAME="toc9" HREF="#sect9">SEE ALSO</A></LI>
</UL>
</BODY></HTML>


@ -1,109 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>MORPH(3WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
morphinit, re_morphinit, morphstr, morphword
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>#include
"wn.h" </B> <P>
<B>int morphinit(void); </B> <P>
<B>int re_morphinit(void); </B> <P>
<B>char *morphstr(char
*origstr, int pos); </B> <P>
<B>char *morphword(char *word, int pos); </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION
</A></H2>
<P>
The WordNet morphological processor, Morphy, is accessed through these
functions: <P>
<B>morphinit()</B> is used to open the exception list files. It returns
<B>0 </B> if successful, <B>-1 </B> otherwise. The exception list files must be opened
before <B>morphstr() </B> or <B>morphword()</B> is called. <P>
<B>re_morphinit()</B> is used to
close the exception list files and reopen them, and is used exclusively
for WordNet development. Return codes are as described above. <P>
<B>morphstr()</B>
is the basic user interface to Morphy. It tries to find the base form
(lemma) of the word or collocation <I>origstr </I> in the specified <I>pos </I>. The
first call (with <I>origstr </I> specified) returns a pointer to the first base
form found. Subsequent calls requesting base forms of the same string
must be made with the first argument of <FONT SIZE=-1><B>NULL. </B></FONT>
When no more base forms
for <I>origstr </I> can be found, <FONT SIZE=-1><B>NULL </B></FONT>
is returned. Note that <B>morphstr() </B> returns
a pointer to a static character buffer. A subsequent call to <B>morphstr()
</B> with a new string (instead of <B>NULL </B>) will overwrite the string pointed
to by a previous call. Users should copy the returned string into a local
buffer, or use the C library function <B>strdup </B> to duplicate the returned
string into a <I>malloc'd </I> buffer. <P>
<B>morphword()</B> tries to find the base form
of <I>word </I> in the specified <I>pos </I>. This function is called by <B>morphstr()</B> for
each individual word in a collocation. Note that <B>morphword() </B> returns a
pointer to a static character buffer. A subsequent call to <B>morphword()
</B> will overwrite the string pointed to by a previous call. Users should
copy the returned string into a local buffer, or use the C library function
<B>strdup </B> to duplicate the returned string into a <I>malloc'd </I> buffer.
<H2><A NAME="sect3" HREF="#toc3">NOTES
</A></H2>
<B>morphinit()</B> is called by <B>wninit() </B> and is not intended to be called directly
by an application. Applications wishing to use WordNet and/or the morphological
functions must call <B>wninit() </B> at the start of the program. See <A HREF="wnutil.3WN.html"><B>wnutil</B>(3WN)</A>
for more information. <P>
<I>origstr </I> may be either a word or a collocation formed
by joining individual words with underscore characters (<B>_ </B>). <P>
Usually only
<B>morphstr() </B> is called from applications, as it works on both words and
collocations. <P>
<I>pos </I> must be one of the following: <P>
<blockquote><B>1 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>2 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>3 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE
<BR>
<B>4 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
<B>5 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE_SATELLITE <BR>
</blockquote>
<P>
If <FONT SIZE=-1><B>ADJECTIVE_SATELLITE </B></FONT>
is passed,
it is treated by <B>morphstr() </B> as <FONT SIZE=-1><B>ADJECTIVE. </B></FONT>
<H2><A NAME="sect4" HREF="#toc4">SEE ALSO </A></H2>
<A HREF="wnintro.3WN.html"><B>wnintro</B>(3WN)</A>
, <A HREF="wnsearch.3WN.html"><B>wnsearch</B>(3WN)</A>
,
<A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>
, <A HREF="morphy.7WN.html"><B>morphy</B>(7WN)</A>
. <P>
<H2><A NAME="sect5" HREF="#toc5">WARNINGS </A></H2>
Passing an invalid part of speech will
result in a core dump. <P>
The WordNet database files must be open to use
<B>morphstr() </B> or <B>morphword() </B>. <P>
<H2><A NAME="sect6" HREF="#toc6">BUGS </A></H2>
Morphy will allow non-words to be converted
to words, if they follow one of the rules of detachment described in <B>morphy</B>(7WN). For example,
it will happily convert <B>plantes </B> to <B>plants </B>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">NOTES</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEE ALSO</A></LI>
<LI><A NAME="toc5" HREF="#sect5">WARNINGS</A></LI>
<LI><A NAME="toc6" HREF="#sect6">BUGS</A></LI>
</UL>
</BODY></HTML>


@ -1,221 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>MORPHY(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
morphy - discussion of WordNet's morphological processing
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION
</A></H2>
Although only base forms of words are usually stored in WordNet, searches
may be done on inflected forms. A set of morphology functions, Morphy,
is applied to the search string to generate a form that is present in
WordNet. <P>
Morphology in WordNet uses two types of processes to try to convert
the string passed into one that can be found in the WordNet database. There
are lists of inflectional endings, based on syntactic category, that can
be detached from individual words in an attempt to find a form of the
word that is in WordNet. There are also exception list files, one for
each syntactic category, in which a search for an inflected form is done.
Morphy tries to use these two processes in an intelligent manner to translate
the string passed to the base form found in WordNet. Morphy first checks
for exceptions, then uses the rules of detachment. The Morphy functions
are not independent from WordNet. After each transformation, WordNet is
searched for the resulting string in the syntactic category specified.
<P>
The Morphy functions are passed a string and a syntactic category. A
string is either a single word or a collocation. Since some words, such
as <B>axes </B>, can have more than one base form (<B>axe </B> and <B>axis </B>), Morphy works
in the following manner. The first time that Morphy is called with a specific
string, it returns a base form. For each subsequent call to Morphy made
with a <FONT SIZE=-1><B>NULL </B></FONT>
string argument, Morphy returns another base form. Whenever
Morphy cannot perform a transformation, whether on the first call for
a word or subsequent calls, <FONT SIZE=-1><B>NULL </B></FONT>
is returned. A transformation to a
valid English string will return <FONT SIZE=-1><B>NULL </B></FONT>
if the base form of the string
is not in WordNet. <P>
The morphological functions are found in the WordNet
library. See <A HREF="morph.3WN.html"><B>morph</B>(3WN)</A>
for information on using these functions.
<H3><A NAME="sect2" HREF="#toc2">Rules
of Detachment </A></H3>
The following table shows the rules of detachment used by
Morphy. If a word ends with one of the suffixes, it is stripped from the
word and the corresponding ending is added. Then WordNet is searched for
the resulting string. No rules are applicable to adverbs. <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>POS </B> </TD> <TD ALIGN=CENTER><B>Suffix
</B> </TD> <TD ALIGN=CENTER><B>Ending </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"s" </TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"ses" </TD> <TD ALIGN=LEFT>"s" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"xes" </TD> <TD ALIGN=LEFT>"x" </TD>
</TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"zes" </TD> <TD ALIGN=LEFT>"z" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"ches" </TD> <TD ALIGN=LEFT>"ch" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"shes" </TD> <TD ALIGN=LEFT>"sh" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN
</TD> <TD ALIGN=LEFT>"men" </TD> <TD ALIGN=LEFT>"man" </TD> </TR>
<TR> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>"ies" </TD> <TD ALIGN=LEFT>"y" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"s" </TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"ies" </TD> <TD ALIGN=LEFT>"y"
</TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"es" </TD> <TD ALIGN=LEFT>"e" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"es" </TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"ed" </TD> <TD ALIGN=LEFT>"e" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"ed"
</TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"ing" </TD> <TD ALIGN=LEFT>"e" </TD> </TR>
<TR> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>"ing" </TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=LEFT>"er" </TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=LEFT>"est"
</TD> <TD ALIGN=LEFT>"" </TD> </TR>
<TR> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=LEFT>"er" </TD> <TD ALIGN=LEFT>"e" </TD> </TR>
<TR> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=LEFT>"est" </TD> <TD ALIGN=LEFT>"e" </TD> </TR>
</TABLE>
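<P>
The table above can be read as a simple suffix-replacement procedure. The Python sketch below is one possible rendering, not Morphy's actual source: the RULES dictionary copies the table, while the detach function and the plain set used as a stand-in for the WordNet database are assumptions made for illustration.

```python
# Rules of detachment copied from the table above: (suffix, ending) per POS.
RULES = {
    "noun": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
             ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "verb": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
             ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
    "adj": [("er", ""), ("est", ""), ("er", "e"), ("est", "e")],
}


def detach(word, pos, lexicon):
    """Return candidate base forms of `word` that appear in `lexicon`.

    `lexicon` is a set of known base forms standing in for WordNet.
    """
    found = []
    for suffix, ending in RULES.get(pos, []):  # adverbs have no rules
        if word.endswith(suffix):
            # Strip the suffix, add the corresponding ending.
            candidate = word[:len(word) - len(suffix)] + ending
            if candidate in lexicon and candidate not in found:
                found.append(candidate)
    return found
```

Because a candidate is accepted only when it is present in the lexicon, a rule that produces a non-word is silently discarded, which mirrors the search step described above.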
<H3><A NAME="sect3" HREF="#toc3">Exception Lists </A></H3>
There is one
exception list file for each syntactic category. The exception lists contain
the morphological transformations for strings that are not regular and
therefore cannot be processed in an algorithmic manner. Each line of an
exception list contains an inflected form of a word or collocation, followed
by one or more base forms. The list is kept in alphabetical order and
a binary search is used to find words in these lists. See <A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>
for
information on the format of the exception list files.
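<P>
A binary search over a sorted exception list might look like the following Python sketch. It assumes that each line holds whitespace-separated fields, with the inflected form first; the function names are hypothetical, and the standard <B>bisect </B> module stands in for WordNet's own search routine.

```python
import bisect


def load_exceptions(lines):
    """Parse exception-list lines into parallel, sorted lists.

    Assumes fields are separated by whitespace, inflected form first.
    """
    entries = sorted(line.split() for line in lines if line.strip())
    keys = [fields[0] for fields in entries]
    bases = [fields[1:] for fields in entries]
    return keys, bases


def lookup_exception(word, keys, bases):
    """Binary-search `keys` for `word`; return its base forms or []."""
    i = bisect.bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return bases[i]
    return []
```

Sorting inside load_exceptions is only a safeguard for the sketch; a file that is already kept in alphabetical order would not need it.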
<H3><A NAME="sect4" HREF="#toc4">Single Words </A></H3>
In
general, single words are relatively easy to process. Morphy first looks
for the word in the exception list. If it is found the first base form
is returned. Subsequent calls with a <FONT SIZE=-1><B>NULL </B></FONT>
argument return additional
base forms, if present. A <FONT SIZE=-1><B>NULL </B></FONT>
is returned when there are no more base
forms of the word. <P>
If the word is not found in the exception list corresponding
to the syntactic category, an algorithmic process using the rules of detachment
looks for a matching suffix. If a matching suffix is found, a corresponding
ending is applied (sometimes this ending is an empty string, so in effect
the suffix is removed from the word), and WordNet is consulted to see
if the resulting word is found in the desired part of speech.
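<P>
Put together, the single-word procedure is "exception list first, rules of detachment second". The Python sketch below shows that control flow under simplifying assumptions: the exception list is a dictionary, the rules are a list of (suffix, ending) pairs for one syntactic category, and a set stands in for WordNet. None of these names come from the WordNet library.

```python
def single_word_base_forms(word, exceptions, rules, lexicon):
    """Exception list first; fall back to the rules of detachment.

    `exceptions` maps an inflected form to its base forms, `rules` is a
    list of (suffix, ending) pairs, and `lexicon` is a set of base forms.
    All three are stand-ins for illustration.
    """
    if word in exceptions:
        return list(exceptions[word])
    results = []
    for suffix, ending in rules:
        if word.endswith(suffix):
            candidate = word[:len(word) - len(suffix)] + ending
            # Keep a candidate only if the lexicon contains it.
            if candidate in lexicon and candidate not in results:
                results.append(candidate)
    return results
```

Returning a list rather than one form per call is a convenience of the sketch; the library interface instead hands back one base form per call.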
<H3><A NAME="sect5" HREF="#toc5">Collocations
</A></H3>
As opposed to single words, collocations can be quite difficult to transform
into a base form that is present in WordNet. In general, only base forms
of words, even those comprising collocations, are stored in WordNet, such
as <B>attorney&nbsp;general </B>. Transforming the collocation <B>attorneys&nbsp;general </B>
is then simply a matter of finding the base forms of the individual words
comprising the collocation. This usually works for nouns; non-conforming
nouns, such as <B>customs&nbsp;duty </B>, are therefore entered in the noun exception
list. <P>
Verb collocations that contain prepositions, such as <B>ask&nbsp;for&nbsp;it
</B>, are more difficult. As with single words, the exception list is searched
first. If the collocation is not found, special code in Morphy determines
whether a verb collocation includes a preposition. If it does, a function
is called to try to find the base form in the following manner. It is
assumed that the first word in the collocation is a verb and that the
last word is a noun. The algorithm then builds a search string with the
base forms of the verb and noun, leaving the remainder of the collocation
(usually just the preposition, but more words may be involved) in the
middle. For example, passed <B>asking&nbsp;for&nbsp;it </B>, the database search would
be performed with <B>ask&nbsp;for&nbsp;it </B>, which is found in WordNet, and therefore
returned from Morphy. If a verb collocation does not contain a preposition,
then the base form of each word in the collocation is found and WordNet
is searched for the resulting string.
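<P>
The verb collocation strategy can be illustrated with the Python sketch below. It is a simplified reading of the description, not Morphy's code: the preposition set is a small invented sample, words are assumed to be joined with underscores, base_form is a caller-supplied function, and treating every word as a verb in the no-preposition case is an assumption.

```python
PREPOSITIONS = {"for", "to", "of", "on", "in", "at", "with"}  # illustrative subset


def verb_collocation_base(collocation, base_form, lexicon):
    """Return the base form of a verb collocation, or None if not in `lexicon`.

    `base_form(word, pos)` returns the base form of a single word, or the
    word itself when it has none. `lexicon` is a set standing in for WordNet.
    """
    words = collocation.split("_")
    if len(words) > 2 and any(w in PREPOSITIONS for w in words[1:-1]):
        # Verb first, noun last, everything in between left untouched.
        words = ([base_form(words[0], "verb")] + words[1:-1]
                 + [base_form(words[-1], "noun")])
    else:
        # No preposition: take the base form of every word (assumed verb POS).
        words = [base_form(w, "verb") for w in words]
    candidate = "_".join(words)
    return candidate if candidate in lexicon else None
```

With a base_form that maps "asking" to "ask", the string "asking_for_it" becomes "ask_for_it", as in the example above.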
<H3><A NAME="sect6" HREF="#toc6">Hyphenation </A></H3>
Hyphenation also presents
special difficulties when searching WordNet. It is often a subjective decision
as to whether a word is hyphenated, joined as one word, or is a collocation
of several words, and which of the various forms are entered into WordNet.
When Morphy breaks a string into "words", it looks for both spaces and
hyphens as delimiters. It also looks for periods in strings and removes
them if an exact match is not found. A search for an abbreviation like
<B>oct. </B> returns the synset for <B>{&nbsp;October,&nbsp;Oct&nbsp;} </B>. Not every pattern of hyphenated
and collocated string is searched for properly, so it may be advantageous
to specify several search strings if the results of a search attempt seem
incomplete.
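<P>
The two string-handling steps described above, splitting on spaces and hyphens and retrying without periods, could be sketched in Python as follows. The function names and the set used as a stand-in for WordNet are hypothetical, and the sketch makes no claim about the order in which Morphy itself applies these steps.

```python
import re


def split_words(string):
    """Break a string into words, treating spaces and hyphens as delimiters."""
    return [w for w in re.split(r"[ \-]", string) if w]


def strip_periods(string, lexicon):
    """Try an exact match first; if that fails, retry with periods removed."""
    if string in lexicon:
        return string
    stripped = string.replace(".", "")
    return stripped if stripped in lexicon else None
```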
<H3><A NAME="sect7" HREF="#toc7">Special Processing for nouns ending with 'ful' </A></H3>
Morphy contains
code that searches for nouns ending with <B>ful </B> and performs a transformation
on the substring preceding it. It then appends 'ful' back onto the resulting
string and returns it. For example, if passed the noun <B>boxesful </B>, it will
return <B>boxful </B>.
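<P>
A minimal Python sketch of the 'ful' handling follows. The function name is invented, and base_form is a caller-supplied stand-in for the single-word noun processing described earlier.

```python
def ful_noun_base(word, base_form):
    """Strip a trailing 'ful', take the base form of the rest, re-append it.

    `base_form(word, pos)` returns the base form of a single word, or the
    word itself when it has none. Returns None if `word` does not end in 'ful'.
    """
    if not word.endswith("ful") or len(word) <= 3:
        return None
    stem = word[:-3]
    return base_form(stem, "noun") + "ful"
```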
<H2><A NAME="sect8" HREF="#toc8">BUGS </A></H2>
Since many noun collocations contain prepositions,
such as <B>line&nbsp;of&nbsp;products </B>, an algorithm similar to that used for verbs
should be written for nouns. In the present scheme, if Morphy is passed
<B>lines&nbsp;of&nbsp;products </B>, the search string becomes <B>line&nbsp;of&nbsp;product </B>, which
is not in WordNet. <P>
Morphy will allow non-words to be converted to words,
if they follow one of the rules described above. For example, it will
happily convert <B>plantes </B> to <B>plants </B>.
<H2><A NAME="sect9" HREF="#toc9">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B>
</DT>
<DD>Base directory for WordNet. Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B>
</DT>
<DD>Directory in which the WordNet database has been installed. Default
is <B>WNHOME/dict </B>. </DD>
</DL>
<H2><A NAME="sect10" HREF="#toc10">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B>
</DT>
<DD>Base directory for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect11" HREF="#toc11">FILES
</A></H2>
<DL>
<DT><B><I>pos </I>.exc</B> </DT>
<DD>morphology exception lists </DD>
</DL>
<H2><A NAME="sect12" HREF="#toc12">SEE ALSO </A></H2>
<B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A>
,
<B><A HREF="morph.3WN.html">morph</B>(3WN)</A>
, <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wninput.7WN.html">wninput</B>(7WN)</A>
. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">Rules of Detachment</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Exception Lists</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Single Words</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Collocations</A></LI>
<LI><A NAME="toc6" HREF="#sect6">Hyphenation</A></LI>
<LI><A NAME="toc7" HREF="#sect7">Special Processing for nouns ending with 'ful'</A></LI>
</UL>
<LI><A NAME="toc8" HREF="#sect8">BUGS</A></LI>
<LI><A NAME="toc9" HREF="#sect9">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc10" HREF="#sect10">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc11" HREF="#sect11">FILES</A></LI>
<LI><A NAME="toc12" HREF="#sect12">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,318 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>PROLOGDB(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wn_pl - description of Prolog database files
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
The files
<B>wn_ </B><I>* </I><B>.pl </B> contain the WordNet database in a prolog-readable format. A prolog
interface to WordNet is not implemented. <P>
The prolog database is very large
and may take many minutes to load into the Prolog workspace. A separate
file has been created for each WordNet relation giving the user the ability
to load only those parts of the database that they are interested in. <P>
See
<B>FILES </B>, below, for a list of the database files and <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
and <B><A HREF="wninput.5WN.html">wninput</B>(5WN)<B></B></A>
for detailed descriptions of the various WordNet relations (referred to
as <I>operators </I> in this manual page).
<H3><A NAME="sect2" HREF="#toc2">File Format </A></H3>
Each prolog database file
contains information corresponding to the synsets and word senses contained
in the WordNet database. In the prolog version of the database, the <I>synset_id
</I>s (defined below) are used as unique synset identifiers. <P>
Each line of
a file contains an operator that corresponds to a WordNet relation. All
lines with the same <I>operator </I> value are stored in the file <B>wn_ </B><I>operator
</I><B>.pl </B>. <P>
The general format of a line in a prolog database file is as follows:
<P>
<blockquote><I>operator<B>(<I>field1<B>,<I>&nbsp;&nbsp;...&nbsp;&nbsp;<B>,<I>fieldn<B>). </B></I></B></I></B></I></B></I> <BR>
</blockquote>
<P>
Each line contains the name of the
operator, followed by a left parenthesis, a comma-separated list of fields,
a right parenthesis, and a period. Note there are no spaces, and each
line is terminated with a newline character.
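<P>
For readers who want to process these files outside Prolog, the line format can be split with a small parser such as the sketch below. It assumes only what this page states: one fact per line, comma-separated fields, and single-quoted strings in which an embedded quote is written as two adjacent single quotes (see NOTES). The function name is an assumption, and the sample values in the usage note are illustrative rather than taken from the database.

```python
def parse_fact(line):
    """Split 'operator(field1,...,fieldn).' into (operator, [fields])."""
    line = line.rstrip("\n")
    if "(" not in line or not line.endswith(")."):
        raise ValueError("not a fact line: %r" % line)
    open_paren = line.index("(")
    operator = line[:open_paren]
    body = line[open_paren + 1 : -2]  # drop the trailing ")."

    fields, current, in_quotes = [], [], False
    i = 0
    while i < len(body):
        ch = body[i]
        if in_quotes:
            if ch == "'":
                if i + 1 < len(body) and body[i + 1] == "'":
                    current.append("'")  # doubled quote -> literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                current.append(ch)
        elif ch == "'":
            in_quotes = True  # opening quote
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return operator, fields
```

For example, an illustrative line <B>s(100000001,1,'entity',n,1,0). </B> would be split into the operator <B>s </B> and six string fields. All fields are returned as strings; converting numeric fields is left to the caller.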
<H3><A NAME="sect3" HREF="#toc3">Operators </A></H3>
Each WordNet relation
is represented in a separate file by <I>operator </I> name. Some operators are
reflexive (i.e. the "reverse" relation is implicit). So, for example, if
<B>x </B> is a hypernym of <B>y </B>, <B>y </B> is necessarily a hyponym of <B>x </B>. In the prolog
database, reflected pointers are usually implied for semantic relations.
<P>
Semantic relations are represented by a pair of <I>synset_id </I>s, in which
the first <I>synset_id </I> is generally the source of the relation and the second
is the target. If two pairs <I>synset_id </I><B>, </B><I>w_num </I> are present, the operator
represents a lexical relation between word forms. <P>
<B>s(<I>synset_id<B>,<I>w_num<B>,'<I>word<B>',<I>ss_type<B>,<I>sense_number<B>,<I>tag_count<B>).
</B></I></B></I></B></I></B></I></B></I></B></I></B><BR>
<blockquote>A <B>s </B> operator is present for every word sense in WordNet. In <B>wn_s.pl
</B>, <I>w_num </I> specifies the word number for <I>word </I> in the synset. </blockquote>
<P>
<B>g(<I>synset_id<B>,'(<I>gloss<B>)').
</B></I></B></I></B><BR>
<blockquote>The <B>g </B> operator specifies the gloss for a synset. </blockquote>
<P>
<B>hyp(<I>synset_id<B>,<I>synset_id<B>).
</B></I></B></I></B><BR>
<blockquote>The <B>hyp </B> operator specifies that the second synset is a hypernym of
the first synset. This relation holds for nouns and verbs. The reflexive
operator, hyponym, implies that the first synset is a hyponym of the second
synset. </blockquote>
<P>
<B>ent(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>ent </B> operator specifies that the
second synset is an entailment of the first synset. This relation only holds
for verbs. </blockquote>
<P>
<B>sim(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>sim </B> operator specifies that
the second synset is similar in meaning to the first synset. This means
that the second synset is a satellite of the first synset, which is the cluster
head. This relation only holds for adjective synsets contained in adjective
clusters. </blockquote>
<P>
<B>mm(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>mm </B> operator specifies that the
second synset is a member meronym of the first synset. This relation only
holds for nouns. The reflexive operator, member holonym, can be implied.
</blockquote>
<P>
<B>ms(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>ms </B> operator specifies that the second
synset is a substance meronym of the first synset. This relation only
holds for nouns. The reflexive operator, substance holonym, can be implied.
</blockquote>
<P>
<B>mp(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>mp </B> operator specifies that the second
synset is a part meronym of the first synset. This relation only holds
for nouns. The reflexive operator, part holonym, can be implied. </blockquote>
<P>
<B>cs(<I>synset_id<B>,<I>synset_id<B>).
</B></I></B></I></B><BR>
<blockquote>The <B>cs </B> operator specifies that the second synset is a cause of the
first synset. This relation only holds for verbs. </blockquote>
<P>
<B>vgp(<I>synset_id<B>,<I>synset_id<B>).
</B></I></B></I></B><BR>
<blockquote>The <B>vgp </B> operator specifies verb synsets that are similar in meaning
and should be grouped together when displayed in response to a grouped
synset search. </blockquote>
<P>
<B>at(<I>synset_id<B>,<I>synset_id<B>). </B></I></B></I></B><BR>
<blockquote>The <B>at </B> operator defines the
attribute relation between noun and adjective synset pairs in which the
adjective is a value of the noun. For each pair, both relations are listed
(i.e. each <I>synset_id </I> is both a source and target). </blockquote>
<P>
<B>ant(<I>synset_id<B>,<I>w_num<B>,<I>synset_id<B>,<I>w_num<B>).
</B></I></B></I></B></I></B></I></B><BR>
<blockquote>The <B>ant </B> operator specifies antonymous <I>word </I>s. This is a lexical relation
that holds for all syntactic categories. For each antonymous pair, both
relations are listed (i.e. each <I>synset_id,w_num </I> pair is both a source and
target word.) </blockquote>
<P>
<B>sa(<I>synset_id<B>,<I>w_num<B>,<I>synset_id<B>,<I>w_num<B>). </B></I></B></I></B></I></B></I></B><BR>
<blockquote>The <B>sa </B> operator
specifies that additional information about the first word can be obtained
by seeing the second word. This operator is only defined for verbs and
adjectives. There is no reflexive relation (i.e. it cannot be inferred that
the additional information about the second word can be obtained from
the first word). </blockquote>
<P>
<B>ppl(<I>synset_id<B>,<I>w_num<B>,<I>synset_id<B>,<I>w_num<B>). </B></I></B></I></B></I></B></I></B><BR>
<blockquote>The <B>ppl </B> operator
specifies that the adjective first word is a participle of the verb second
word. The reflexive operator can be implied. </blockquote>
<P>
<B>per(<I>synset_id<B>,<I>w_num<B>,<I>synset_id<B>,<I>w_num<B>).
</B></I></B></I></B></I></B></I></B><BR>
<blockquote>The <B>per </B> operator specifies two different relations based on the parts
of speech involved. If the first word is in an adjective synset, that
word pertains to either the noun or adjective second word. If the first
word is in an adverb synset, that word is derived from the adjective second
word. </blockquote>
<P>
<B>fr(<I>synset_id<B>,<I>f_num<B>,<I>w_num<B>). </B></I></B></I></B></I></B><BR>
<blockquote>The <B>fr </B> operator specifies a generic
sentence frame for one or all words in a synset. The operator is defined
only for verbs. </blockquote>
<H3><A NAME="sect4" HREF="#toc4">Field Definitions </A></H3>
A <I>synset_id </I> is a nine byte field in
which the first byte defines the syntactic category of the synset and
the remaining eight bytes are a <I>synset_offset </I>, as defined in <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
,
indicating the byte offset in the <B>data. </B><I>pos </I> file that corresponds to the
syntactic category. <P>
The syntactic category is encoded as: <P>
<blockquote><B>1 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>2 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>3 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>4 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
</blockquote>
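<P>
As a rough illustration of the encoding above, a <I>synset_id </I> can be split into its syntactic category and <I>synset_offset </I> as follows. The function and dictionary names are assumptions made for this sketch.

```python
POS_BY_DIGIT = {"1": "noun", "2": "verb", "3": "adjective", "4": "adverb"}


def split_synset_id(synset_id):
    """Return (syntactic category, synset_offset) for a nine-digit synset_id."""
    text = str(synset_id)
    if len(text) != 9 or not text.isdigit():
        raise ValueError("synset_id must be nine digits: %r" % (synset_id,))
    if text[0] not in POS_BY_DIGIT:
        raise ValueError("unknown syntactic category digit: %r" % text[0])
    # The remaining eight digits are the byte offset into the data.pos file.
    return POS_BY_DIGIT[text[0]], int(text[1:])
```

For example, <B>split_synset_id("200001740") </B> gives <B>("verb", 1740) </B>.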
<P>
<I>w_num </I>, if present, indicates which word
in the synset is being referred to. Word numbers are assigned to the <I>word
</I> fields in a synset, from left to right, beginning with 1. When used to
represent lexical WordNet relations <I>w_num </I> may be 0, indicating that the
relation holds for all words in the synset indicated by the preceding
<I>synset_id </I>. See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a discussion of semantic and lexical
relations. <P>
<I>ss_type </I> is a one character code indicating the synset type:
<P>
<blockquote><B>n </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>v </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>a </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>s </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE&nbsp;SATELLITE <BR>
<B>r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
</blockquote>
<P>
<I>sense_number
</I> specifies the sense number of the word, within the part of speech encoded
in the <I>synset_id </I>, in the WordNet database. <P>
<I>word </I> is the ASCII text of
the word as entered in the synset by the lexicographer, with spaces replaced
by underscore characters (<B>_ </B>). The text of the word is case sensitive.
An adjective <I>word </I> is immediately followed by a syntactic marker if one
was specified in the lexicographer file. A syntactic marker is appended,
in parentheses, onto <I>word </I> without any intervening spaces. See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a list of the syntactic markers for adjectives. <P>
Each synset has a
<I>gloss </I> that may contain a definition, one or more example sentences, or
both. Note that glosses are enclosed in single forward quotes and parentheses:&nbsp;&nbsp;<B>'(<I>gloss<B>)'
</B></I></B>. <P>
<I>f_num </I> specifies the generic sentence frame number for word <I>w_num </I> in
the synset indicated by <I>synset_id </I>. Note that when <I>w_num </I> is <B>0 </B>, the frame
number applies to all words in the synset. If non-zero, the frame applies
to that word in the synset. <P>
In WordNet, sense numbers are assigned as
described in <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
. <I>tag_count </I> is the number of times the sense was
tagged in the Semantic Concordances, and <B>0 </B> if it was not instantiated.
<H2><A NAME="sect5" HREF="#toc5">NOTES </A></H2>
Since single forward quotes are used to enclose character strings,
single quote characters found in <I>word </I> and <I>gloss </I> fields are represented
as two adjacent single quote characters. <P>
The load time can be greatly
reduced by creating "object language" versions of the files, an option
that is supported by some implementations, such as Quintus Prolog.
<H2><A NAME="sect6" HREF="#toc6">ENVIRONMENT
VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet. Default is <B>/usr/local/WordNet-3.0
</B>. </DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory
for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">FILES </A></H2>
All files are
in <B>WNHOME/prolog </B> on Unix platforms and <B>WNHome\prolog </B> on Windows platforms.
<DL>
<DT><B>wn_s.pl</B> </DT>
<DD>synset pointers </DD>
<DT><B>wn_g.pl</B> </DT>
<DD>gloss pointers </DD>
<DT><B>wn_hyp.pl</B> </DT>
<DD>hypernym pointers
</DD>
<DT><B>wn_ent.pl</B> </DT>
<DD>entailment pointers </DD>
<DT><B>wn_sim.pl</B> </DT>
<DD>similar pointers </DD>
<DT><B>wn_mm.pl</B> </DT>
<DD>member
meronym pointers </DD>
<DT><B>wn_ms.pl</B> </DT>
<DD>substance meronym pointers </DD>
<DT><B>wn_mp.pl</B> </DT>
<DD>part meronym
pointers </DD>
<DT><B>wn_cs.pl</B> </DT>
<DD>cause pointers </DD>
<DT><B>wn_vgp.pl</B> </DT>
<DD>grouped verb pointers </DD>
<DT><B>wn_at.pl</B>
</DT>
<DD>attribute pointers </DD>
<DT><B>wn_ant.pl</B> </DT>
<DD>antonym pointers </DD>
<DT><B>wn_sa.pl</B> </DT>
<DD>see also pointers
</DD>
<DT><B>wn_ppl.pl</B> </DT>
<DD>participle pointers </DD>
<DT><B>wn_per.pl</B> </DT>
<DD>pertainym pointers </DD>
<DT><B>wn_fr.pl</B> </DT>
<DD>frame
pointers </DD>
</DL>
<H2><A NAME="sect9" HREF="#toc9">SEE ALSO </A></H2>
<B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
, <B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
, <B><A HREF="wnpkgs.7WN.html">wnpkgs</B>(7WN)</A>
.
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">File Format</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Operators</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Field Definitions</A></LI>
</UL>
<LI><A NAME="toc5" HREF="#sect5">NOTES</A></LI>
<LI><A NAME="toc6" HREF="#sect6">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc7" HREF="#sect7">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc8" HREF="#sect8">FILES</A></LI>
<LI><A NAME="toc9" HREF="#sect9">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,184 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>SENSEIDX(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
index.sense, sense.idx - WordNet's sense index
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
The WordNet
sense index provides an alternate method for accessing synsets and word
senses in the WordNet database. It is useful to applications that retrieve
synsets or other information related to a specific sense in WordNet, rather
than all the senses of a word or collocation. It can also be used with
tools like <B>grep </B> and Perl to find all senses of a word in one or more
parts of speech. A specific WordNet sense, encoded as a <I>sense_key </I>, can
be used as an index into this file to obtain its WordNet sense number,
the database byte offset of the synset containing the sense, and the number
of times it has been tagged in the semantic concordance texts. <P>
Concatenating
the <I>lemma </I> and <I>lex_sense </I> fields of a semantically tagged word (represented
in a <B>&lt;wf&nbsp; </B>...&nbsp;<B>&gt; </B> attribute/value pair) in a semantic concordance file, using
<B>% </B> as the concatenation character, creates the <I>sense_key </I> for that sense,
which can in turn be used to search the sense index file. <P>
A <I>sense_key
</I> is the best way to represent a sense in semantic tagging or other systems
that refer to WordNet senses. <I>sense_key </I>s are independent of WordNet sense
numbers and <I>synset_offset </I>s, which vary between versions of the database.
Using the sense index and a <I>sense_key </I>, the corresponding synset (via
the <I>synset_offset </I>) and WordNet sense number can easily be obtained. A
mapping from noun <I>sense_key </I>s in WordNet 1.6 to corresponding 2.0 <I>sense_key
</I>s is provided with version 2.0, and is described in <B><A HREF="sensemap.5WN.html">sensemap</B>(5WN)</A>
. <P>
See
<B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
for a thorough discussion of the WordNet database files.
<H3><A NAME="sect2" HREF="#toc2">File
Format </A></H3>
The sense index file lists all of the senses in the WordNet database
with each line representing one sense. The file is in alphabetical order,
fields are separated by one space, and each line is terminated with a
newline character. <P>
Each line is of the form: <P>
<blockquote><I>sense_key&nbsp;&nbsp;synset_offset&nbsp;&nbsp;sense_number&nbsp;&nbsp;tag_cnt
</I> </blockquote>
<P>
<I>sense_key </I> is an encoding of the word sense. Programs can construct
a sense key in this format and use it as a binary search key into the
sense index file. The format of a <I>sense_key </I> is described below. <P>
<I>synset_offset
</I> is the byte offset that the synset containing the sense is found at in
the database "data" file corresponding to the part of speech encoded in
the <I>sense_key </I>. <I>synset_offset </I> is an 8 digit, zero-filled decimal integer,
and can be used with <B><A HREF="fseek.3.html">fseek</B>(3)</A>
to read a synset from the data file. When
passed to the WordNet library function <B>read_synset() </B> along with the syntactic
category, a data structure containing the parsed synset is returned. <P>
<I>sense_number
</I> is a decimal integer indicating the sense number of the word, within
the part of speech encoded in <I>sense_key </I>, in the WordNet database. See
<B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
for information about how sense numbers are assigned. <P>
<I>tag_cnt
</I> represents the decimal number of times the sense is tagged in various
semantic concordance texts. A <I>tag_cnt </I> of <B>0 </B> indicates that the sense
has not been semantically tagged.
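<P>
A line of the sense index can be read into a small record along the lines of the sketch below. The class and function names are assumptions, and the sample line in the usage note is illustrative rather than copied from <B>index.sense </B>.

```python
from typing import NamedTuple


class SenseIndexEntry(NamedTuple):
    sense_key: str
    synset_offset: int  # byte offset into the matching "data" file
    sense_number: int
    tag_cnt: int


def parse_sense_index_line(line):
    """Parse 'sense_key synset_offset sense_number tag_cnt'."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        raise ValueError("expected 4 space-separated fields: %r" % line)
    sense_key, synset_offset, sense_number, tag_cnt = parts
    return SenseIndexEntry(
        sense_key, int(synset_offset), int(sense_number), int(tag_cnt)
    )
```

Given an illustrative line such as <B>dog%1:05:00:: 02084071 1 42 </B>, the zero-filled offset is returned as the integer 2084071, which is the form a seek call would need.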
<H3><A NAME="sect3" HREF="#toc3">Sense Key Encoding </A></H3>
A <I>sense_key </I> is represented
as: <P>
<blockquote><I>lemma </I><B>% </B><I>lex_sense </I> </blockquote>
<P>
where <I>lex_sense </I> is encoded as: <P>
<blockquote><I>ss_type<B>:<I>lex_filenum<B>:<I>lex_id<B>:<I>head_word<B>:<I>head_id
</I></B></I></B></I></B></I></B></I> </blockquote>
<P>
<I>lemma </I> is the ASCII text of the word or collocation as found in the
WordNet database index file corresponding to <I>pos </I>. <I>lemma </I> is in lower case,
and collocations are formed by joining individual words with an underscore
(<B>_ </B>) character. <P>
<I>ss_type </I> is a one digit decimal integer representing the
synset type for the sense. See <FONT SIZE=-1><B>Synset Type </B></FONT>
below for a listing of the
numbers corresponding to each synset type. <P>
<I>lex_filenum </I> is a two digit
decimal integer representing the name of the lexicographer file containing
the synset for the sense. See <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
for the list of lexicographer
file names and their corresponding numbers. <P>
<I>lex_id </I> is a two digit decimal
integer that, when appended onto <I>lemma </I>, uniquely identifies a sense within
a lexicographer file. <I>lex_id </I> numbers usually start with <B>00 </B>, and are incremented
as additional senses of the word are added to the same file, although
there is no requirement that the numbers be consecutive or begin with
<B>00 </B>. Note that a value of <B>00 </B> is the default, and therefore is not present
in lexicographer files. Only non-default <I>lex_id </I> values must be explicitly
assigned in lexicographer files. See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for information on the
format of lexicographer files. <P>
<I>head_word </I> is only present if the sense
is in an adjective satellite synset. It is the lemma of the first word
of the satellite's head synset. <P>
<I>head_id </I> is a two digit decimal integer
that, when appended onto <I>head_word </I>, uniquely identifies the sense of
<I>head_word </I> within a lexicographer file, as described for <I>lex_id </I>. There
is a value in this field only if <I>head_word </I> is present.
<H3><A NAME="sect4" HREF="#toc4">Synset Type </A></H3>
The
synset type is encoded as follows: <P>
<blockquote><B>1 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>2 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>3 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>4 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB
<BR>
<B>5 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE SATELLITE <BR>
</blockquote>
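<P>
The encoding described above can be taken apart, or assembled from a <I>lemma </I> and <I>lex_sense </I>, with a sketch like the following. The function names are assumptions, the sample keys in the usage note are illustrative, and the parser assumes that <I>head_word </I> contains neither <B>% </B> nor <B>: </B>.

```python
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


def make_sense_key(lemma, lex_sense):
    """Join lemma and lex_sense with '%' to form a sense_key."""
    return lemma + "%" + lex_sense


def parse_sense_key(sense_key):
    """Split 'lemma%ss_type:lex_filenum:lex_id:head_word:head_id'."""
    # Split on the last '%'; assumes head_word itself contains no '%'.
    lemma, lex_sense = sense_key.rsplit("%", 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return {
        "lemma": lemma,
        "ss_type": SS_TYPES[int(ss_type)],
        "lex_filenum": int(lex_filenum),
        "lex_id": int(lex_id),
        # Empty for non-satellite senses; the ':' separators are still present.
        "head_word": head_word or None,
        "head_id": int(head_id) if head_id else None,
    }
```

For an illustrative key such as <B>dog%1:05:00:: </B> the two head fields come back as <B>None </B>, while a satellite key such as <B>hot%5:00:00:warm:01 </B> fills both.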
<H2><A NAME="sect5" HREF="#toc5">NOTES </A></H2>
For non-satellite senses the <I>head_word
</I> and <I>head_id </I> fields have no values; however, the field separator character
(<B>: </B>) is present.
<H2><A NAME="sect6" HREF="#toc6">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory
for WordNet. Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in
which the WordNet database has been installed. Default is <B>WNHOME/dict
</B>. </DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory
for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">FILES </A></H2>
<DL>
<DT><B>index.sense</B> </DT>
<DD>sense
index </DD>
</DL>
<H2><A NAME="sect9" HREF="#toc9">SEE ALSO </A></H2>
<B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A>
, <B><A HREF="wnsearch.3WN.html">wnsearch</B>(3WN)</A>
, <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
, <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
,
<B><A HREF="sensemap.5WN.html">sensemap</B>(5WN)</A>
, <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">File Format</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Sense Key Encoding</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Synset Type</A></LI>
</UL>
<LI><A NAME="toc5" HREF="#sect5">NOTES</A></LI>
<LI><A NAME="toc6" HREF="#sect6">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc7" HREF="#sect7">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc8" HREF="#sect8">FILES</A></LI>
<LI><A NAME="toc9" HREF="#sect9">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,53 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>UNIQBEG(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
uniqbeg - unique beginners for noun hierarchies
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
All
of the WordNet noun synsets are organized into hierarchies, headed by
the unique beginner synset for <B>entity </B> in the file <B>noun.Tops </B>. <P>
<blockquote>{ entity
(that which is perceived or known or inferred to have its own <BR>
distinct
existence (living or nonliving)) } <BR>
<P>
</blockquote>
<H2><A NAME="sect2" HREF="#toc2">NOTES </A></H2>
The lexicographer files are
not included in the WordNet database package.
<H2><A NAME="sect3" HREF="#toc3">FILES </A></H2>
<DL>
<DT><B>noun.Tops</B> </DT>
<DD>unique beginners
for nouns </DD>
</DL>
<H2><A NAME="sect4" HREF="#toc4">SEE ALSO </A></H2>
<B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
, <B><A HREF="wnintro.7WN.html">wnintro</B>(7WN)</A>
, <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
.
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<LI><A NAME="toc2" HREF="#sect2">NOTES</A></LI>
<LI><A NAME="toc3" HREF="#sect3">FILES</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,388 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WN(1WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wn - command line interface to WordNet lexical database
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS
</A></H2>
<B>wn </B> [ <I>searchstr </I> ] [ <B>-h </B>] [ <B>-g </B> ] [ <B>-a </B> ] [ <B>-l </B> ] [ <B>-o </B> ] [ <B>-s </B> ] [ <B>-n<I># </I></B> ] [
<I>search_option </I>... ]
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<B>wn() </B> provides a command line interface
to the WordNet database, allowing synsets and relations to be displayed
as formatted text. For each word, different searches are provided, based
on syntactic category and pointer types. Although only base forms of words
are usually stored in WordNet, users may search for inflected forms. A
morphological process is applied to the search string to generate a form
that is present in WordNet. <P>
The command line interface is often useful
when writing scripts to extract information from the WordNet database.
Post-processing of the output with various scripting tools can reformat
the results as desired.
<H2><A NAME="sect3" HREF="#toc3">OPTIONS </A></H2>
<DL>
<DT><B>-h</B> </DT>
<DD>Print help text before search results.
</DD>
<DT><B>-g</B> </DT>
<DD>Display textual glosses associated with synsets. </DD>
<DT><B>-a</B> </DT>
<DD>Display lexicographer
file information. </DD>
<DT><B>-o</B> </DT>
<DD>Display synset offset of each synset. </DD>
<DT><B>-s</B> </DT>
<DD>Display each
word's sense numbers in synsets. </DD>
<DT><B>-l</B> </DT>
<DD>Display the WordNet copyright notice,
version number, and license. </DD>
<DT><B>-n<I># </I></B> </DT>
<DD>Perform search on sense number <I># </I> only.
</DD>
<DT><B>-over </B> </DT>
<DD>Display overview of all senses of <I>searchstr </I> in all syntactic categories.
</DD>
</DL>
<H3><A NAME="sect4" HREF="#toc4">Search Options </A></H3>
Note that the last letter of <I>search_option </I> generally
denotes the part of speech that the search applies to: <B>n </B> for nouns, <B>v
</B> for verbs, <B>a </B> for adjectives, and <B>r </B> for adverbs. Multiple searches may
be done for <I>searchstr </I> with a single command by specifying all the appropriate
search options. <P>
<DL>
<DT><B>-syns </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>) </DT>
<DD>Display synonyms and immediate
hypernyms of synsets containing <I>searchstr </I>. Synsets are ordered by estimated
frequency of use. For adjectives, if <I>searchstr </I> is in a head synset, the
cluster's satellite synsets are displayed in place of hypernyms. If <I>searchstr
</I> is in a satellite synset, its head synset is also displayed. </DD>
<DT><B>-simsv </B> </DT>
<DD>Display
verb synonyms and immediate hypernyms of synsets containing <I>searchstr
</I>. Synsets are grouped by similarity of meaning. </DD>
<DT><B>-ants </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>)
</DT>
<DD>Display synsets containing antonyms of <I>searchstr </I>. For adjectives, if <I>searchstr
</I> is in a head synset, <I>searchstr </I> has a direct antonym. The head synset
for the direct antonym is displayed along with the direct antonym's satellite
synsets. If <I>searchstr </I> is in a satellite synset, <I>searchstr </I> has an indirect
antonym via the head synset, which is displayed. </DD>
<DT><B>-faml </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>)
</DT>
<DD>Display familiarity and polysemy information for <I>searchstr </I>. </DD>
<DT><B>-hype </B>(<I>n </I>
| <I>v </I>) </DT>
<DD>Recursively display hypernym (superordinate) tree for <I>searchstr
</I> (<I>searchstr </I> <I>IS A KIND OF _____ </I> relation). </DD>
<DT><B>-hypo </B>(<I>n </I> | <I>v </I>) </DT>
<DD>Display immediate
hyponyms (subordinates) for <I>searchstr </I> (<I>_____ IS A KIND OF </I> <I>searchstr
</I> relation). </DD>
<DT><B>-tree </B>(<I>n </I> | <I>v </I>) </DT>
<DD>Display hyponym (subordinate) tree for <I>searchstr
</I>. This is a recursive search that finds the hyponyms of each hyponym. </DD>
<DT><B>-coor
</B>(<I>n </I> | <I>v </I>) </DT>
<DD>Display the coordinates (sisters) of <I>searchstr </I>. This search
prints the immediate hypernym for each synset that contains <I>searchstr
</I> and the hypernym's immediate hyponyms. </DD>
<DT><B>-deri </B>(<I>n </I> | <I>v </I>) </DT>
<DD>Display derivational
morphology links between noun and verb forms. </DD>
<DT><B>-domn </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>) </DT>
<DD>Display
domain that <I>searchstr </I> has been classified in. </DD>
<DT><B>-domt </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>) </DT>
<DD>Display
all terms classified as members of the <I>searchstr </I>'s domain. </DD>
<DT><B>-subsn</B> </DT>
<DD>Display
substance meronyms of <I>searchstr </I> (<I>HAS SUBSTANCE </I> relation). </DD>
<DT><B>-partn</B> </DT>
<DD>Display
part meronyms of <I>searchstr </I> (<I>HAS PART </I> relation). </DD>
<DT><B>-membn</B> </DT>
<DD>Display member
meronyms of <I>searchstr </I> (<I>HAS MEMBER </I> relation). </DD>
<DT><B>-meron</B> </DT>
<DD>Display all meronyms
of <I>searchstr </I> (<I>HAS PART, HAS MEMBER, HAS SUBSTANCE </I> relations). </DD>
<DT><B>-hmern</B>
</DT>
<DD>Display meronyms for <I>searchstr </I> tree. This is a recursive search that
prints all the meronyms of <I>searchstr </I> and all of its hypernyms. </DD>
<DT><B>-sprtn</B>
</DT>
<DD>Display <I>part of </I> holonyms of <I>searchstr </I> (<I>PART OF </I> relation). </DD>
<DT><B>-smemn</B> </DT>
<DD>Display
<I>member of </I> holonyms of <I>searchstr </I> (<I>MEMBER OF </I> relation). </DD>
<DT><B>-ssubn</B> </DT>
<DD>Display
<I>substance of </I> holonyms of <I>searchstr </I> (<I>SUBSTANCE OF </I> relation). </DD>
<DT><B>-holon</B> </DT>
<DD>Display
all holonyms of <I>searchstr </I> (<I>PART OF, MEMBER OF, SUBSTANCE OF </I> relations).
</DD>
<DT><B>-hholn</B> </DT>
<DD>Display holonyms for <I>searchstr </I> tree. This is a recursive search
that prints all the holonyms of <I>searchstr </I> and all of each holonym's holonyms.
</DD>
<DT><B>-entav</B> </DT>
<DD>Display entailment relations of <I>searchstr </I>. </DD>
<DT><B>-framv</B> </DT>
<DD>Display applicable
verb sentence frames for <I>searchstr </I>. </DD>
<DT><B>-causv</B> </DT>
<DD>Display <I>cause to </I> relations
of <I>searchstr </I>. </DD>
<DT><B> -pert </B>(<I>a </I> | <I>r </I>) </DT>
<DD>Display pertainyms of <I>searchstr </I>. </DD>
<DT><B> -attr </B>(<I>n
</I> | <I>a </I>) </DT>
<DD>Display adjective values for noun attribute, or noun attributes
of adjective values. </DD>
<DT><B>-grep </B>(<I>n </I> | <I>v </I> | <I>a </I> | <I>r </I>) </DT>
<DD>List compound words containing
<I>searchstr </I> as a substring. </DD>
</DL>
<H2><A NAME="sect5" HREF="#toc5">SEARCH RESULTS </A></H2>
The results of a search are
written to the standard output. For each search, the output consists of a
one-line description of the search, followed by the search results. <P>
All
searches other than <B>-over </B> list all senses matching the search results
in the following general format. Items enclosed in italicized square brackets
(<I>[&nbsp;...&nbsp;] </I>) may not be present. <P>
<blockquote>One line listing the number of senses matching
the search request. <P>
Each sense matching the search requested is displayed
as follows: <P>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<B>Sense <I>n </I></B> <BR>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<I>[</I><B>{</B><I>synset_offset</I><B>}</B><I>]</I> <I>[</I><B>&lt;</B><I>lex_filename</I><B>&gt;</B><I>]</I>&nbsp;&nbsp;<I>word1[</I><B>#</B><I>sense_number][,&nbsp;&nbsp;word2...]</I>
<BR>
<P>
Where <I>n </I> is the sense number of the search word, <I>synset_offset </I> is
the byte offset of the synset in the <B>data.<I>pos </I></B> file corresponding to the
syntactic category, <I>lex_filename </I> is the name of the lexicographer file
that the synset comes from, <I>word1 </I> is the first word in the synset (note
that this is not necessarily the search word) and <I>sense_number </I> is the
WordNet sense number assigned to the preceding word. <I>synset_offset, lex_filename
</I>, and <I>sense_number </I> are generated when the <B>-o, -a, </B> and <B>-s </B> options, respectively,
are specified. <P>
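As an illustration of the format above, the following is a minimal Python sketch that splits one synset line into its optional parts. The function name <I>parse_synset_line</I> and the returned dictionary layout are hypothetical, not part of WordNet, and the sketch assumes glosses are not being printed (no <B>-g </B>), since a gloss may itself contain commas.

```python
import re

# Hypothetical helper, not part of WordNet: parses one synset line of the form
#   [{synset_offset}] [<lex_filename>]  word1[#sense_number][,  word2...]
# Assumes no gloss is appended to the line.
SYNSET_LINE = re.compile(
    r"^\s*(?:=>\s*)?"                 # optional leading indentation and => marker
    r"(?:\{(?P<offset>\d+)\}\s*)?"    # optional {synset_offset}
    r"(?:<(?P<lexfile>[^>]+)>\s*)?"   # optional <lex_filename>
    r"(?P<words>.+)$"                 # comma-separated words
)


def parse_synset_line(line):
    """Return the offset, lexicographer file name and words found on a line."""
    match = SYNSET_LINE.match(line)
    if match is None:
        raise ValueError("not a synset line: %r" % line)
    words = []
    for item in match.group("words").split(","):
        item = item.strip()
        if not item:
            continue
        # A sense number, when present, follows the word after a '#'.
        word, has_sense, sense = item.partition("#")
        words.append((word, int(sense) if has_sense else None))
    offset = match.group("offset")
    return {
        "offset": int(offset) if offset is not None else None,
        "lex_filename": match.group("lexfile"),
        "words": words,
    }
```

Fields that were not requested on the command line come back as <I>None </I>, so the same function can be applied whichever of the three options were given. <P>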
The synsets matching the search requested are printed below
each sense's synset output described above. Each line of output is preceded
by a marker (usually <B>=&gt; </B>), then a synset, formatted as described above.
If a search traverses more than one level of the tree, then successive lines
are indented by spaces corresponding to their level in the hierarchy. When
the <B>-g </B> option is specified, synset glosses are displayed in parentheses
at the end of each synset. Each synset is printed on one line. <P>
Senses
are generally ordered from most to least frequently used, with the most
common sense numbered <B>1 </B>. Frequency of use is determined by the number
of times a sense is tagged in the various semantic concordance texts.
Senses that are not semantically tagged follow the ordered senses. Note
that this ordering is only an estimate based on usage in a small corpus.
<P>
Verb senses can be grouped by similarity of meaning, rather than ordered
by frequency of use. The <B>-simsv </B> search prints all senses that are close
in meaning together, with a line of dashes indicating the end of a group.
See <B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
for a discussion of how senses are grouped. <P>
The <B>-over
</B> search displays an overview of all the senses of the search word in all
syntactic categories. The results of this search are similar to the <B>-syns
</B> search; however, no additional (e.g. hypernym) synsets are displayed, and
synset glosses are always printed. The senses are grouped by syntactic
category, and each synset is annotated as described above with <I>synset_offset
</I>, <I>lex_filename </I>, and <I>sense_number </I> as dictated by the <B>-o, -a, </B> and <B>-s </B> options.
The overview search also indicates how many of the senses in each syntactic
category are represented in the tagged texts. This is a way for the user
to determine whether a sense's sense number is based on semantic tagging
data, or was arbitrarily assigned. For each sense that has appeared in
such texts, the number of semantic tags for that sense is indicated in
parentheses after the sense number. <P>
If a search cannot be performed on
some senses of <I>searchstr </I>, the search results are headed by a string of
the form: <tt> </tt>&nbsp;<tt> </tt>&nbsp;X of Y senses of <I>searchstr </I> <BR>
<P>
The output of the <B>-deri </B> search
shows word forms that are morphologically related to <B>searchstr </B>. Each word
form pointed to from <I>searchstr </I> is displayed, preceded by <B>RELATED TO-&gt; </B>
and the syntactic category of the link, followed, on the next line, by
its synset. Printed after the word form is <B># </B><I>n </I> where <I>n </I> indicates the
WordNet sense number of the term pointed to. <P>
The <B>-domn </B> and <B>-domt </B> searches
show the domain that a synset has been classified in and, conversely,
all of the terms that have been assigned to a specific domain. A domain
is either a <B>TOPIC, </B> <B>REGION </B> or <B>USAGE, </B> as reflected in the specific pointer
character stored in the database, and displayed in the output. A <B>-domn
</B> search on a term shows the domain, if any, that each synset containing
<I>searchstr </I> has been classified in. The output display shows the domain
type (<B>TOPIC, </B> <B>REGION </B> or <B>USAGE </B>), followed by the syntactic category of
the domain synset and the terms in the synset. Each term is followed by
<B># </B><I>n </I> where <I>n </I> indicates the WordNet sense number of the term. The converse
search, <B>-domt </B>, shows all of the synsets that have been placed into the
domain <I>searchstr </I>, with analogous markers. <P>
When <B>-framv </B> is specified,
sample illustrative sentences and generic sentence frames are displayed.
If a sample sentence is found, the base form of <I>searchstr </I> is substituted
into the sentence, and it is printed below the synset, preceded with the
<B>EX: </B> marker. When no sample sentences are found, the generic sentence
frames are displayed. Sentence frames that are acceptable for all words
in a synset are preceded by the marker <B>*&gt; </B>. If a frame is acceptable for
the search word only, it is preceded by the marker <B>=&gt; </B>. <P>
Search results
for adjectives are slightly different from those for other parts of speech.
When an adjective is printed, its direct antonym, if it has one, is also
printed in parentheses. When <I>searchstr </I> is in a head synset, all of the
head synset's satellites are also displayed. The position of an adjective
in relation to the noun may be restricted to the <I>prenominal </I>, <I>postnominal
</I> or <I>predicative </I> position. Where present, these restrictions are noted
in parentheses. <P>
When an adjective is a participle of a verb, the output
indicates the verb and displays its synset. <P>
When an adverb is derived
from an adjective, the specific adjectival sense on which it is based
is indicated. <P>
The morphological transformations performed by the search
code may result in more than one word to search for. WordNet automatically
performs the requested search on all of the strings and returns the results
grouped by word. For example, the verb <B>saw </B> is both the present tense
of <B>saw </B> and the past tense of <B>see </B>. When passed <I>searchstr </I> <B>saw </B>, WordNet
performs the desired search first on <B>saw </B> and next on <B>see </B>, returning
the list of <B>saw </B> senses and search results, followed by those for <B>see
</B>. </blockquote>
<H2><A NAME="sect6" HREF="#toc6">EXIT STATUS </A></H2>
<B>wn() </B> normally exits with the number of senses displayed.
If <I>searchstr </I> is not found in WordNet, it exits with <B>0 </B>. <P>
If the WordNet
database cannot be opened, an error message is displayed and <B>wn() </B> exits
with <B>-1 </B>.
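<P>
A calling script can use the exit status to tell these cases apart. The Python sketch below is an assumption-laden illustration: the helper names are hypothetical, it assumes a <B>wn </B> executable on the search path, and it assumes a POSIX system, where only the low 8 bits of the status reach the parent process, so <B>-1 </B> is seen as 255. For the same reason, sense counts of 255 or more cannot be told apart from an error this way.

```python
import subprocess


def interpret_wn_status(returncode):
    """Classify a wn exit status as seen by a POSIX parent process."""
    # An exit value of -1 is truncated to its low 8 bits, i.e. 255.
    if returncode == 255:
        return ("database_error", None)
    if returncode == 0:
        return ("not_found", 0)
    # Any other value is taken to be the number of senses displayed.
    return ("senses", returncode)


def run_overview(word):
    """Run the overview search for a word (assumes `wn` is on PATH)."""
    completed = subprocess.run(
        ["wn", word, "-over"], capture_output=True, text=True
    )
    return interpret_wn_status(completed.returncode), completed.stdout
```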
<H2><A NAME="sect7" HREF="#toc7">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet.
Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the
WordNet database has been installed. Default is <B>WNHOME/dict </B>. </DD>
</DL>
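<P>
The lookup order described above can be sketched in Python as follows. The function name <I>wordnet_search_dir </I> is hypothetical, and the sketch only mirrors the documented defaults; it does not claim to reproduce how the WordNet library itself resolves the path.

```python
import os
import posixpath

# Documented default for WNHOME on Unix.
DEFAULT_WNHOME = "/usr/local/WordNet-3.0"


def wordnet_search_dir(env=None):
    """Return the database directory implied by WNSEARCHDIR and WNHOME."""
    env = os.environ if env is None else env
    # WNSEARCHDIR, when set, names the database directory directly.
    search_dir = env.get("WNSEARCHDIR")
    if search_dir:
        return search_dir
    # Otherwise fall back to WNHOME/dict, using the default base directory
    # when WNHOME is not set either.
    home = env.get("WNHOME") or DEFAULT_WNHOME
    return posixpath.join(home, "dict")
```

Passing a plain dictionary instead of the real environment makes the fallback chain easy to check.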
<H2><A NAME="sect8" HREF="#toc8">REGISTRY
(WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory for
WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect9" HREF="#toc9">FILES </A></H2>
<DL>
<DT><B>index.<I>pos </I></B> </DT>
<DD>database
index files </DD>
<DT><B>data.<I>pos </I></B> </DT>
<DD>database data files </DD>
<DT><B>*.vrb</B> </DT>
<DD>files of sentences illustrating
the use of verbs </DD>
<DT><B><I>pos </I>.exc</B> </DT>
<DD>morphology exception lists </DD>
</DL>
<H2><A NAME="sect10" HREF="#toc10">SEE ALSO </A></H2>
<B><A HREF="wnintro.1WN.html">wnintro</B>(1WN)</A>
,
<B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
, <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
, <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
, <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
,<B></B> <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
,
<B><A HREF="morphy.7WN.html">morphy</B>(7WN)</A>
, <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
, <B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
.
<H2><A NAME="sect11" HREF="#toc11">BUGS </A></H2>
Please report bugs to wordnet@princeton.edu.
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">OPTIONS</A></LI>
<UL>
<LI><A NAME="toc4" HREF="#sect4">Search Options</A></LI>
</UL>
<LI><A NAME="toc5" HREF="#sect5">SEARCH RESULTS</A></LI>
<LI><A NAME="toc6" HREF="#sect6">EXIT STATUS</A></LI>
<LI><A NAME="toc7" HREF="#sect7">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc8" HREF="#sect8">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc9" HREF="#sect9">FILES</A></LI>
<LI><A NAME="toc10" HREF="#sect10">SEE ALSO</A></LI>
<LI><A NAME="toc11" HREF="#sect11">BUGS</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,524 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNB(1WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnb - WordNet window-based browser interface
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>wnb </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION
</A></H2>
<B>wnb() </B> provides a window-based interface for browsing the WordNet database,
allowing synsets and relations to be displayed as formatted text. For
each search word, different searches are available based on syntactic
category and information available in the database. <P>
<B>wnb </B> is written
in Tcl/Tk, which is available for Unix and Windows platforms. This allows
the same code to work on all supported WordNet platforms without modification.
<H2><A NAME="sect3" HREF="#toc3">WNB WINDOWS </A></H2>
<B>wnb() </B> was developed with the philosophy that only those
searches and buttons that are applicable at the current time are displayed.
As a result, the appearance of the interface changes as it is used. Use
the standard windowing system mouse functions to open and close the WordNet
Browser Window, move the window, and change its size. <P>
The WordNet Browser
Window contains the following areas, from top to bottom:
<DL>
<DT>Menubar </DT>
<DD>A menubar
runs along the top of the browser window with pulldown menus and buttons
entitled <B>File </B>, <B>History </B>, <B>Options </B>, and <B>Help </B>. </DD>
<DT>Search Word Entry </DT>
<DD>Below
the Menubar is a line for entering the search word. A search word can
be a single word, hyphenated string, or a collocation. Case is ignored.
Although only uninflected forms of words are usually stored in WordNet,
users may search for inflected forms. WordNet's morphological processor
finds the base form automatically. </DD>
<DT>Search Selection </DT>
<DD>Below the Search Word
Entry line is an area for selecting the search type and senses to search.
Until a search word is entered this area is blank. After a search word
is entered, buttons appear corresponding to each syntactic category (<B>Noun
</B>, <B>Verb </B>, <B>Adjective </B>, <B>Adverb </B>) in which the search string is defined in
WordNet. </DD>
</DL>
<P>
At the right edge of the Search Selection line is a box for
entering sense numbers. When this box is empty, search results for all
senses of the search word that match the search type are displayed. The
search may be restricted to one or more specific senses by entering a
comma or space separated list of sense numbers in the <B>Senses </B> box. These
sense numbers remain in effect until either the user changes or deletes
them, or a new search word is entered.
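<P>
The accepted input for the <B>Senses </B> box can be illustrated with a short Python sketch. This is not the browser's Tcl/Tk code; the function name <I>parse_senses </I> is hypothetical, and treating an empty result as "all senses" follows the description above.

```python
import re


def parse_senses(text):
    """Split a comma or space separated list of sense numbers into integers."""
    # Any run of commas and whitespace counts as a single separator.
    tokens = [token for token in re.split(r"[,\s]+", text.strip()) if token]
    # An empty list means no restriction: all senses are searched.
    return [int(token) for token in tokens]
```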
<DL>
<DT>Results Window </DT>
<DD>Most of the browser
window consists of a large text buffer for displaying the results of WordNet
searches. Horizontal and vertical scroll bars are present for scrolling
through the output. </DD>
<DT>Status Line </DT>
<DD>A status line is at the bottom of the
browser window. When search results are displayed in the Results Window,
this status line reflects the type of search selected. When there is no
search word entered, you are prompted to <B>"Enter search word and press
return." </B> If the search word entered is not in WordNet, the message <B>"Sorry,
no matches found." </B> is displayed. </DD>
</DL>
<H2><A NAME="sect4" HREF="#toc4">SEARCHING THE DATABASE </A></H2>
The WordNet browser
navigates through WordNet in two steps. First a search word is entered
and an overview of all the senses of the word in all syntactic categories
is displayed in the Results Window. The senses are grouped by syntactic
category, and each synset is annotated as described above with <I>synset_offset
</I>, <I>lex_filename </I>, and <I>sense_number </I> as dictated by the advanced search
options set. The overview search also indicates how many of the senses
in each syntactic category are represented in the tagged texts. This is
a way for the user to determine whether a sense's sense number is based
on semantic tagging data, or was arbitrarily assigned. For each sense
that has appeared in such texts, the number of semantic tags to that sense
is indicated in parentheses after the sense number. <P>
Then, within a syntactic
category, a specific search is selected. The desired search is performed
and the search results are displayed in the Results Window. Additional
searches on the same word can be performed, or a new search word can be
entered. <P>
To enter a search word, click the mouse in the horizontal box
labeled <B>Search Word </B>, type a single word, hyphenated string, or collocation
and press <FONT SIZE=-1><B>RETURN. </B></FONT>
<P>
<B>wnb() </B> responds by making a set of Part of Speech
buttons appear in the Search Selection line. Each button corresponds to
a syntactic category in which the search string is defined in WordNet.
At the same time, an Overview of the synsets for all senses of the search
word is displayed in the Results Window. The Overview includes the gloss
for each synset and also indicates which of the senses have appeared in
the semantically tagged texts. For each sense that has appeared in such
texts, the number of semantic tags for that sense is indicated in parentheses
after the sense number. <P>
The pulldown menus in the Search Selection line
list all of the WordNet searches that can be performed for the search
word in that part of speech. To select a search, highlight it by dragging
the mouse to it, and release the mouse while it is highlighted. Drag the
mouse outside of the pulldown list and release to hide the menu without
making a selection. Dragging the mouse across the Part of Speech buttons
displays the available searches for each syntactic category. <P>
To restrict
a search to one or more senses within a syntactic category, enter a comma
or space separated list of sense numbers in the <B>Senses </B> box before selecting
a search. <P>
After a search is selected, <B>wnb() </B> performs the search on the
WordNet database and displays the formatted results in the Results Window.
Whenever search results are displayed, a button entitled <B>Redisplay Overview
</B> is present at the right edge of the Search Word Entry line. Clicking
on this button redisplays the Overview of all synsets for the search word
in the Results Window.
<H3><A NAME="sect5" HREF="#toc5">Changing the Search Word </A></H3>
A new search word can
be entered at any time by moving to the Search Word Entry box, if necessary
highlighting it by clicking, erasing the old string, typing a new one
and pressing <FONT SIZE=-1><B>RETURN. </B></FONT>
The <B>Senses </B> box is cleared if necessary, the Part
of Speech buttons applicable to the new search word appear, and the Overview
for the new search word is displayed. <P>
The middle mouse button can also
be used to select a new search word by placing the mouse over any word
in the Results Window and clicking. The selected word will replace the
text in the Search Word Entry box, and the overview for that word will
automatically be displayed. <P>
To select a new search string collocation
from text in the Results Window, highlight the text with the mouse and
press <FONT SIZE=-1><B>CONTROL-S. </B></FONT>
<P>
<H3><A NAME="sect6" HREF="#toc6">Interrupting a Search </A></H3>
When a search is in progress
the message <B>"Searching...(press escape to abort)" </B> is displayed in the Status
Line. Note that most searches return very quickly, so this message isn't
noticeable. As indicated, pressing the <FONT SIZE=-1><B>ESCAPE </B></FONT>
key will interrupt the
search. The results of the search obtained before the time the search
was interrupted are displayed in the Results Window.
<H2><A NAME="sect7" HREF="#toc7">MENUS </A></H2>
<H3><A NAME="sect8" HREF="#toc8">File Menu
</A></H3>
<blockquote>
<DL>
<DT>Find keywords by substring </DT>
<DD>Display a popup window for specifying a search
of WordNet for words or collocations that contain a specific substring.
If a search word is currently entered in the <B>Search Word </B> box, it is
used as the substring to search for by default. The Substring Search Window
contains a box for entering a substring, a pulldown menu to its right
for specifying the part of speech to search, a large area for displaying
the search results, and action buttons at the bottom entitled <B>Search </B>,
<B>Save </B>, <B>Print </B>, and <B>Dismiss </B>. </DD>
</DL>
<P>
Once a substring is entered and a part of speech
selected, clicking on the <B>Search </B> button causes a search to be done for
all words and collocations in WordNet, in that syntactic category, that
contain the substring according to the following criteria: <P>
1. The substring
can appear at the beginning or end of a word, hyphenated string or collocation.
<P>
2. The substring can appear in the middle of a hyphenated string or collocation,
but only delimited on both sides by spaces or hyphens. <P>
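One possible reading of these two criteria is sketched below in Python. The function name <I>substring_matches </I> is hypothetical, the sketch treats "beginning or end" as the beginning or end of the whole entry, and it assumes case is ignored as it is for ordinary search words; the browser's actual matching code may differ.

```python
import re


def substring_matches(substring, entry):
    """Check an entry against the two substring criteria, ignoring case."""
    sub = substring.lower()
    text = entry.lower()
    if not sub:
        return False
    # Criterion 1: the substring begins or ends the entry.
    if text.startswith(sub) or text.endswith(sub):
        return True
    # Criterion 2: in the middle, delimited on both sides by a space or hyphen.
    delimited = r"[ -]" + re.escape(sub) + r"[ -]"
    return re.search(delimited, text) is not None
```

Under this reading, "cat" matches "category", "tomcat" and "big-cat-house", but not "concatenate". <P>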
The search results
are displayed in the large buffer. Clicking on an item from the search
results list causes <B>wnb() </B> to automatically enter that word in the <B>Search
Word </B> box of the WordNet Browser Window and perform the Overview search.
<P>
Clicking the <B>Save </B> button generates a popup dialog for specifying a filename
to save the substring search results to. Clicking the <B>Print </B> button generates
a popup dialog in which a print command can be specified. <P>
Selecting <B>Dismiss
</B> closes the Substring Search Window.
<DL>
<DT>Save current display </DT>
<DD>Display a popup
dialog for specifying a filename to save the current Results Window contents
to. </DD>
<DT>Print current display </DT>
<DD>Display a popup dialog in which to specify a
print command to which the current Results Window contents can be piped.
Note - this option does not exist in the Windows version. </DD>
<DT>Clear current
display </DT>
<DD>Clear the <B>Search Word </B> and <B>Senses </B> boxes, and Results Window. </DD>
<DT>Exit
</DT>
<DD>Exit the browser. </DD>
</DL>
</blockquote>
<H3><A NAME="sect9" HREF="#toc9">History </A></H3>
This pulldown menu contains a list
of the last searches performed. Selecting an item from this list performs
that search again. The maximum number of searches stored in the list can
be adjusted from the <B>Options </B> menu. The default is 10.
<H3><A NAME="sect10" HREF="#toc10">Options </A></H3>
<blockquote>
<DL>
<DT>Show help
with each search </DT>
<DD>When this checkbox is selected search results are preceded
by some explanatory text about the type of search selected. This is off
by default. </DD>
<DT>Show descriptive gloss </DT>
<DD>When this checkbox is selected, synset
glosses are displayed in all search results. This is set by default. Note
that glosses are always displayed in the Overview. </DD>
<DT>Wrap Lines </DT>
<DD>When this
checkbox is selected, lines in the Results Window that are wider than
the window are automatically wrapped. This is set by default. If not selected,
a horizontal scroll bar is present if any lines are longer than the width
of the window. </DD>
<DT>Set advanced search options... </DT>
<DD>Selecting this item displays
a popup window for setting the following search options: <B>Lexical file
information; Synset location in database file; Sense number </B>. Choices
for each are: </DD>
</DL>
<P>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<B>Don't show </B> (default) <BR>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<B>Show with searches </B> <BR>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<B>Show with searches
and overview </B> <BR>
<P>
When lexical file information is shown, the name of the
lexicographer file is printed before each synset, enclosed in angle brackets
(<B>&lt;</B>&nbsp;&nbsp;<I>...</I>&nbsp;&nbsp;<B>&gt;</B>). When both lexical file information and synset location information
are displayed, the synset location information appears first. If within
one lexicographer file more than one sense of a word is entered, an integer
<I>lex_id </I> is appended onto all but one of the word's instances to uniquely
identify it. In each synset, each word having a non-zero <I>lex_id </I> is printed
with the <I>lex_id </I> value printed immediately following the word. If both
lexicographer information and sense numbers are displayed, <I>lex_id </I>s, if
present, precede sense numbers. <P>
When synset location is shown, the byte
offset of the synset in the database "data" file corresponding to the
syntactic category of the synset is printed before each synset, enclosed
in curly braces (<B>{</B>&nbsp;&nbsp;<I>...</I>&nbsp;&nbsp;<B>}</B>). When both lexical file information and synset
location information are displayed, the synset location information appears
first. <P>
When sense numbers are shown, the sense number of each word in
each synset is printed immediately after the word, and is preceded by
a number sign (<B># </B>).
<DL>
<DT>Set maximum history length... </DT>
<DD>Display a popup dialog in
which the maximum number of previous searches to be kept on the History
list can be set. </DD>
<DT>Set font... </DT>
<DD>Display a popup window for setting
the font (typeface) and font size to use for the Results Window. Choices
for typeface are: <B>Courier </B>, <B>Helvetica </B>, and <B>Times </B> (default). Font size
can be <B>small </B>, <B>medium </B> (default), or <B>large </B>. </DD>
<DT>Save current options as default
</DT>
<DD>Save the currently set options. Next time the browser is started, these
options will be used as the user defaults. </DD>
</DL>
</blockquote>
<H3><A NAME="sect11" HREF="#toc11">Help </A></H3>
<blockquote>
<DL>
<DT>Help on using the WordNet
browser </DT>
<DD>Display this manual page. </DD>
<DT>Help on WordNet terminology </DT>
<DD>Display the
<B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
manual page. </DD>
<DT>Display the WordNet license </DT>
<DD>Display the WordNet
copyright notice and license agreement. </DD>
<DT>About the WordNet browser </DT>
<DD>Information
about this application. </DD>
</DL>
</blockquote>
<H2><A NAME="sect12" HREF="#toc12">SHORTCUTS </A></H2>
Clicking on any word in the Results
Window while holding down the <FONT SIZE=-1><B>SHIFT </B></FONT>
key on the keyboard causes the
browser to replace <B>Search Word </B> with the word and display its Overview
and available searches. Clicking on any word in the Results Window with
the middle mouse button does the same thing. <P>
Pressing the <FONT SIZE=-1><B>CONTROL-S </B></FONT>
keys
causes the browser to do as above on the text that is currently highlighted.
Under Unix, this will work even if the highlighted text is in another
window. This works on hyphenated strings and collocations, as well as
individual words. <P>
Pressing the <FONT SIZE=-1><B>CONTROL-G </B></FONT>
keys displays the Substring
Search Window. <P>
<H2><A NAME="sect13" HREF="#toc13">SEARCH RESULTS </A></H2>
The results of a search of the WordNet
database are displayed in the Results Window. Horizontal and vertical
scroll bars are present for scrolling through the search results. <P>
All
searches other than the Overview list all senses matching the search results
in the following general format. Items enclosed in italicized square brackets
(<I>[&nbsp;...&nbsp;] </I>) may not be present. <P>
If a search cannot be performed on some senses
of <I>searchstr </I>, the search results are headed by a string of the form:
<tt> </tt>&nbsp;<tt> </tt>&nbsp;X of Y senses of <I>searchstr </I> <BR>
<P>
<blockquote>One line listing the number of senses matching
the search selected. <P>
Each sense matching the search selected is displayed
as follows: <P>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<B>Sense <I>n </I></B> <BR>
<tt> </tt>&nbsp;<tt> </tt>&nbsp;<I>[</I><B>{</B><I>synset_offset</I><B>}</B><I>]</I> <I>[</I><B>&lt;</B><I>lex_filename</I><B>&gt;</B><I>]</I>&nbsp;&nbsp;<I>word1[</I><B>#</B><I>sense_number][,&nbsp;&nbsp;word2...]</I>
<BR>
<P>
Where <I>n </I> is the sense number of the search word, <I>synset_offset </I> is
the byte offset of the synset in the <B>data.<I>pos </I></B> file corresponding to the
syntactic category, <I>lex_filename </I> is the name of the lexicographer file
that the synset comes from, <I>word1 </I> is the first word in the synset (note
that this is not necessarily the search word) and <I>sense_number </I> is the
WordNet sense number assigned to the preceding word. <I>synset_offset </I>, <I>lex_filename
</I>, and <I>sense_number </I> are generated if the appropriate Options are specified.
<P>
The synsets matching the search selected are printed below each sense's
synset output described above. Each line of output is preceded by a marker
(usually <B>=&gt; </B>), then a synset, formatted as described above. If a search
traverses more than one level of the tree, then successive lines are indented
by spaces corresponding to their level in the hierarchy. Glosses are displayed
in parentheses at the end of each synset if the appropriate Option is
set. Each synset is printed on one line. <P>
Senses are ordered from most
to least frequently used, with the most common sense numbered <B>1 </B>. Frequency
of use is determined by the number of times a sense is tagged in the various
semantic concordance texts. Senses that are not semantically tagged follow
the ordered senses. Note that this ordering is only an estimate based on
usage in a small corpus. <P>
Verb senses can be grouped by similarity of meaning,
rather than ordered by frequency of use. When the <B>"Synonyms, grouped by
similarity" </B> search is selected, senses that are close in meaning are
printed together, with a line of dashes indicating the end of a group.
See <B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
for a discussion of how senses are grouped. <P>
The output
of the <B>"Derivationally Related Forms" </B> search shows word forms that are
morphologically related to <B>searchstr </B>. Each word form pointed to from <I>searchstr
</I> is displayed, preceded by <B>RELATED TO-&gt; </B> and the syntactic category of the
link, followed, on the next line, by its synset. Printed after the word
form is <B># </B><I>n </I> where <I>n </I> indicates the WordNet sense number of the term pointed
to. <P>
The <B>"Domain" </B> and <B>"Domain Terms" </B> searches show the domain that a
synset has been classified in and, conversely, all of the terms that have
been assigned to a specific domain. A domain is either a <B>TOPIC, </B> <B>REGION
</B> or <B>USAGE, </B> as reflected in the specific pointer character stored in the
database, and displayed in the output. A <B>Domain </B> search on a term shows
the domain, if any, that each synset containing <I>searchstr </I> has been classified
in. The output display shows the domain type (<B>TOPIC, </B> <B>REGION </B> or <B>USAGE
</B>), followed by the syntactic category of the domain synset and the terms
in the synset. Each term is followed by <B># </B><I>n </I> where <I>n </I> indicates the WordNet
sense number of the term. The converse search, <B>Domain Terms </B>, shows all
of the synsets that have been placed into the domain <I>searchstr </I>, with
analogous markers. <P>
When the <B>"Sentence Frames" </B> search is specified, sample
illustrative sentences and generic sentence frames are displayed. If a
sample sentence is found, the base form of the search word is substituted
into the sentence, and it is printed below the synset, preceded with the
<B>EX: </B> marker. When no sample sentences are found, the generic sentence
frames are displayed. Sentence frames that are acceptable for all words
in a synset are preceded by the marker <B>*&gt; </B>. If a frame is acceptable for
the search word only, it is preceded by the marker <B>=&gt; </B>. <P>
Search results
for adjectives are slightly different from those for other parts of speech.
When an adjective is printed, its direct antonym, if it has one, is also
printed in parentheses. When the search word is in a head synset, all
of the head synset's satellites are also displayed. The position of an
adjective in relation to the noun may be restricted to the <I>prenominal
</I>, <I>postnominal </I> or <I>predicative </I> position. Where present, these restrictions
are noted in parentheses. <P>
When an adjective is a participle of a verb,
the output indicates the verb and displays its synset. <P>
When an adverb
is derived from an adjective, the specific adjectival sense on which it
is based is indicated. <P>
The morphological transformations performed by
the search code may result in more than one word to search for. <B>wnb()
</B> automatically performs the requested search on all of the strings and
returns the results grouped by word. For example, the verb <B>saw </B> is both
the present tense of <B>saw </B> and the past tense of <B>see </B>. When there is more
than one word to search for, search results are grouped by word. </blockquote>
<H2><A NAME="sect14" HREF="#toc14">DIAGNOSTICS
</A></H2>
If the WordNet database files cannot be opened, error messages are displayed.
This is usually corrected by setting the environment variables described
below to the proper location of the WordNet database for your installation.
<H2><A NAME="sect15" HREF="#toc15">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet. Default
is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the WordNet database
has been installed. Default is <B>WNHOME/dict </B>. </DD>
</DL>
<H2><A NAME="sect16" HREF="#toc16">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B>
</DT>
<DD>Base directory for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
<DT><B>HKEY_CURRENT_USER\SOFTWARE\WordNet\3.0\wnres</B>
</DT>
<DD>User's default browser options. </DD>
</DL>
<H2><A NAME="sect17" HREF="#toc17">FILES </A></H2>
<DL>
<DT><B>index.<I>pos </I></B> </DT>
<DD>database index files
</DD>
<DT><B>data.<I>pos </I></B> </DT>
<DD>database data files </DD>
<DT><B>*.vrb</B> </DT>
<DD>files of sentences illustrating the
use of verbs </DD>
<DT><B><I>pos </I>.exc</B> </DT>
<DD>morphology exception lists </DD>
</DL>
<H2><A NAME="sect18" HREF="#toc18">SEE ALSO </A></H2>
<B><A HREF="wnintro.1WN.html">wnintro</B>(1WN)</A>
,
<B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
, <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
, <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
, <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
,
<B><A HREF="morphy.7WN.html">morphy</B>(7WN)</A>
, <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
, <B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
.
<H2><A NAME="sect19" HREF="#toc19">BUGS </A></H2>
Please report bugs to
wordnet@princeton.edu. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">WNB WINDOWS</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEARCHING THE DATABASE</A></LI>
<UL>
<LI><A NAME="toc5" HREF="#sect5">Changing the Search Word</A></LI>
<LI><A NAME="toc6" HREF="#sect6">Interrupting a Search</A></LI>
</UL>
<LI><A NAME="toc7" HREF="#sect7">MENUS</A></LI>
<UL>
<LI><A NAME="toc8" HREF="#sect8">File Menu</A></LI>
<LI><A NAME="toc9" HREF="#sect9">History</A></LI>
<LI><A NAME="toc10" HREF="#sect10">Options</A></LI>
<LI><A NAME="toc11" HREF="#sect11">Help</A></LI>
</UL>
<LI><A NAME="toc12" HREF="#sect12">SHORTCUTS</A></LI>
<LI><A NAME="toc13" HREF="#sect13">SEARCH RESULTS</A></LI>
<LI><A NAME="toc14" HREF="#sect14">DIAGNOSTICS</A></LI>
<LI><A NAME="toc15" HREF="#sect15">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc16" HREF="#sect16">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc17" HREF="#sect17">FILES</A></LI>
<LI><A NAME="toc18" HREF="#sect18">SEE ALSO</A></LI>
<LI><A NAME="toc19" HREF="#sect19">BUGS</A></LI>
</UL>
</BODY></HTML>


@ -1,398 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNDB(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
index.noun, data.noun, index.verb, data.verb, index.adj, data.adj, index.adv,
data.adv - WordNet database files <P>
noun.exc, verb.exc, adj.exc, adv.exc - morphology
exception lists <P>
sentidx.vrb, sents.vrb - files used by search code to display
sentences illustrating the use of some specific verbs
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
For
each syntactic category, two files are needed to represent the contents
of the WordNet database - <B>index. </B><I>pos </I> and <B>data. </B><I>pos </I>, where <I>pos </I> is one of <B>noun
</B>, <B>verb </B>, <B>adj </B> or <B>adv </B>. The other auxiliary files are used by the WordNet
library's searching functions and are needed to run the various WordNet
browsers. <P>
Each index file is an alphabetized list of all the words found
in WordNet in the corresponding part of speech. On each line, following
the word, is a list of byte offsets (<I>synset_offset </I>s) in the corresponding
data file, one for each synset containing the word. Words in the index
file are in lower case only, regardless of how they were entered in the
lexicographer files. This folds various orthographic representations of
the word into one line enabling database searches to be case insensitive.
See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a detailed description of the lexicographer files.
<P>
A data file for a syntactic category contains information corresponding
to the synsets that were specified in the lexicographer files, with relational
pointers resolved to <I>synset_offset </I>s. Each line corresponds to a synset.
Pointers are followed and hierarchies traversed by moving from one synset
to another via the <I>synset_offset </I>s. <P>
The exception list files, <I>pos </I><B>.exc
</B>, are used to help the morphological processor find base forms from irregular
inflections. <P>
The files <B>sentidx.vrb </B> and <B>sents.vrb </B> contain sentences illustrating
the use of specific senses of some verbs. These files are used by the
searching software in response to a request for verb sentence frames.
Generic sentence frames are displayed when an illustrative sentence is
not present. <P>
The various database files are in ASCII formats that are
easily read by both humans and machines. All fields, unless otherwise
noted, are separated by one space character, and all lines are terminated
by a newline character. Fields enclosed in italicized square brackets
may not be present. <P>
See <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
for a glossary of WordNet terminology
and a discussion of the database's content and logical organization.
<H3><A NAME="sect2" HREF="#toc2">Index
File Format </A></H3>
Each index file begins with several lines containing a copyright
notice, version number and license agreement. These lines all begin with
two spaces and the line number so they do not interfere with the binary
search algorithm that is used to look up entries in the index files. All
other lines are in the following format. In the field descriptions, <B>number
</B> always refers to a decimal integer unless otherwise defined. <P>
<I>lemma&nbsp;&nbsp;pos&nbsp;&nbsp;synset_cnt&nbsp;&nbsp;p_cnt&nbsp;&nbsp;[ptr_symbol...]&nbsp;&nbsp;sense_cnt&nbsp;&nbsp;tagsense_cnt
&nbsp;&nbsp;synset_offset&nbsp;&nbsp;[synset_offset...] </I> <BR>
<P>
<DL>
<DT><I>lemma</I> </DT>
<DD>lower case ASCII text of word
or collocation. Collocations are formed by joining individual words with
an underscore (<B>_ </B>) character. </DD>
<DT><I>pos</I> </DT>
<DD>Syntactic category: <B>n </B> for noun files,
<B>v </B> for verb files, <B>a </B> for adjective files, <B>r </B> for adverb files. </DD>
</DL>
<P>
<P>
All remaining
fields are with respect to senses of <I>lemma </I> in <I>pos </I>. <P>
<DL>
<DT><I>synset_cnt</I> </DT>
<DD>Number
of synsets that <I>lemma </I> is in. This is the number of senses of the word
in WordNet. See <FONT SIZE=-1><B>Sense Numbers </B></FONT>
below for a discussion of how sense numbers
are assigned and the order of <I>synset_offset </I>s in the index files. </DD>
<DT><I>p_cnt</I>
</DT>
<DD>Number of different pointers that <I>lemma </I> has in all synsets containing
it. </DD>
<DT><I>ptr_symbol</I> </DT>
<DD>A space separated list of <I>p_cnt </I> different types of pointers
that <I>lemma </I> has in all synsets containing it. See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a list
of <I>pointer_symbol </I>s. If all senses of <I>lemma </I> have no pointers, this field
is omitted and <I>p_cnt </I> is <B>0 </B>. </DD>
<DT><I>sense_cnt</I> </DT>
<DD>Same as <I>synset_cnt </I> above. This
is redundant, but the field was preserved for compatibility reasons. </DD>
<DT><I>tagsense_cnt</I>
</DT>
<DD>Number of senses of <I>lemma </I> that are ranked according to their frequency
of occurrence in semantic concordance texts. </DD>
<DT><I>synset_offset</I> </DT>
<DD>Byte offset
in <B>data.<I>pos </I></B> file of a synset containing <I>lemma </I>. Each <I>synset_offset </I> in
the list corresponds to a different sense of <I>lemma </I> in WordNet. <I>synset_offset
</I> is an 8 digit, zero-filled decimal integer that can be used with <B><A HREF="fseek.3.html">fseek</B>(3)</A>
to read a synset from the data file. When passed to <B><A HREF="read_synset.3WN.html">read_synset</B>(3WN)</A>
along
with the syntactic category, a data structure containing the parsed synset
is returned. </DD>
</DL>
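<P>
The field layout above can be read with a few lines of code. The following is a minimal Python sketch, not the WordNet library's own parser: the names <I>IndexEntry </I> and <I>parse_index_line </I> are hypothetical, and the sample line is made up to match the documented layout rather than copied from a real index file. Header lines, which begin with two spaces, are assumed to have been skipped by the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexEntry:
    lemma: str
    pos: str
    synset_cnt: int
    ptr_symbols: List[str]
    sense_cnt: int
    tagsense_cnt: int
    synset_offsets: List[int]


def parse_index_line(line: str) -> IndexEntry:
    """Parse one non-header line of an index.pos file (hypothetical helper)."""
    fields = line.split()
    lemma, pos = fields[0], fields[1]
    synset_cnt = int(fields[2])
    p_cnt = int(fields[3])
    # p_cnt pointer symbols follow; the slice is empty when p_cnt is 0.
    ptr_symbols = fields[4:4 + p_cnt]
    rest = fields[4 + p_cnt:]
    sense_cnt = int(rest[0])
    tagsense_cnt = int(rest[1])
    # Offsets are 8 digit, zero-filled decimal integers, listed in sense order.
    synset_offsets = [int(f) for f in rest[2:]]
    if len(synset_offsets) != synset_cnt:
        raise ValueError("synset_cnt does not match the number of offsets")
    return IndexEntry(lemma, pos, synset_cnt, ptr_symbols,
                      sense_cnt, tagsense_cnt, synset_offsets)


# Illustrative line in the documented layout (not taken from a real file).
sample = "dog n 3 2 @ ~ 3 1 02084071 10114209 07676602"
entry = parse_index_line(sample)
print(entry.lemma, entry.ptr_symbols, entry.synset_offsets)
```

Each value in <I>synset_offsets </I> can then be used as a seek position in the matching <B>data.<I>pos </I></B> file.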
<H3><A NAME="sect3" HREF="#toc3">Data File Format </A></H3>
Each data file begins with several lines
containing a copyright notice, version number and license agreement. These
lines all begin with two spaces and the line number. All other lines are
in the following format. Integer fields are of fixed length, and are zero-filled.
<P>
<I>synset_offset&nbsp;&nbsp;lex_filenum&nbsp;&nbsp;ss_type&nbsp;&nbsp;w_cnt&nbsp;&nbsp;word&nbsp;&nbsp;lex_id&nbsp;&nbsp;[word&nbsp;&nbsp;lex_id...]&nbsp;&nbsp;p_cnt&nbsp;&nbsp;[ptr...]&nbsp;&nbsp;[frames...]&nbsp;&nbsp;<B>|
</B></I><I>&nbsp;&nbsp;gloss </I> <BR>
<P>
<DL>
<DT><I>synset_offset</I> </DT>
<DD>Current byte offset in the file represented
as an 8 digit decimal integer. </DD>
<DT><I>lex_filenum</I> </DT>
<DD>Two digit decimal integer
corresponding to the lexicographer file name containing the synset. See
<B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
for the list of filenames and their corresponding numbers.
</DD>
<DT><I>ss_type</I> </DT>
<DD>One character code indicating the synset type: </DD>
</DL>
<P>
<blockquote><B>n </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>v </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB
<BR>
<B>a </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>s </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE SATELLITE <BR>
<B>r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
</blockquote>
<DL>
<DT><I>w_cnt</I> </DT>
<DD>Two digit hexadecimal
integer indicating the number of words in the synset. </DD>
<DT><I>word</I> </DT>
<DD>ASCII form
of a word as entered in the synset by the lexicographer, with spaces replaced
by underscore characters (<B>_ </B>). The text of the word is case sensitive,
in contrast to its form in the corresponding <B>index. </B><I>pos </I> file, that contains
only lower-case forms. In <B>data.adj </B>, a <I>word </I> is followed by a syntactic
marker if one was specified in the lexicographer file. A syntactic marker
is appended, in parentheses, onto <I>word </I> without any intervening spaces.
See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a list of the syntactic markers for adjectives. </DD>
<DT><I>lex_id</I>
</DT>
<DD>One digit hexadecimal integer that, when appended onto <I>lemma </I>, uniquely
identifies a sense within a lexicographer file. <I>lex_id </I> numbers usually
start with <B>0 </B>, and are incremented as additional senses of the word are
added to the same file, although there is no requirement that the numbers
be consecutive or begin with <B>0 </B>. Note that a value of <B>0 </B> is the default,
and therefore is not present in lexicographer files. </DD>
<DT><I>p_cnt</I> </DT>
<DD>Three digit
decimal integer indicating the number of pointers from this synset to
other synsets. If <I>p_cnt </I> is <B>000 </B> the synset has no pointers. </DD>
<DT><I>ptr</I> </DT>
<DD>A pointer
from this synset to another. <I>ptr </I> is of the form: </DD>
</DL>
<P>
<I>pointer_symbol&nbsp;&nbsp;synset_offset&nbsp;&nbsp;pos&nbsp;&nbsp;source/target
</I> <BR>
<P>
where <I>synset_offset </I> is the byte offset of the target synset in the
data file corresponding to <I>pos </I>. <P>
The <I>source/target </I> field distinguishes
lexical and semantic pointers. It is a four byte field, containing two
two-digit hexadecimal integers. The first two digits indicate the word
number in the current (source) synset, the last two digits indicate the
word number in the target synset. A value of <B>0000 </B> means that <I>pointer_symbol
</I> represents a semantic relation between the current (source) synset and
the target synset indicated by <I>synset_offset </I>. <P>
A lexical relation between
two words in different synsets is represented by non-zero values in the
source and target word numbers. The first and last two bytes of this field
indicate the word numbers in the source and target synsets, respectively,
between which the relation holds. Word numbers are assigned to the <I>word
</I> fields in a synset, from left to right, beginning with <B>1 </B>. <P>
See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for a list of <I>pointer_symbol </I>s, and semantic and lexical pointer classifications.
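<P>
As a small illustration of the <I>source/target </I> encoding described above, the sketch below splits the four-character field into its two hexadecimal word numbers. The function names are hypothetical and are not part of the WordNet library.

```python
from typing import Tuple


def decode_source_target(field: str) -> Tuple[int, int]:
    """Split a four-character source/target field into two word numbers."""
    if len(field) != 4:
        raise ValueError("source/target must be exactly four characters")
    source_word = int(field[:2], 16)  # word number in the source synset
    target_word = int(field[2:], 16)  # word number in the target synset
    return source_word, target_word


def is_semantic_pointer(field: str) -> bool:
    """A value of 0000 marks a relation between whole synsets."""
    return decode_source_target(field) == (0, 0)


print(decode_source_target("0a01"))
print(is_semantic_pointer("0000"))
```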
<DL>
<DT><I>frames</I> </DT>
<DD>In <B>data.verb </B> only, a list of numbers corresponding to the generic
verb sentence frames for <I>word </I>s in the synset. <I>frames </I> is of the form:
</DD>
</DL>
<P>
<I>f_cnt&nbsp;&nbsp; </I> <B>+ </B> <I>&nbsp;&nbsp;f_num&nbsp;&nbsp;w_num&nbsp;&nbsp;[ </I> <B>+ </B> <I>&nbsp;&nbsp;f_num&nbsp;&nbsp;w_num...] </I> <BR>
<P>
where <I>f_cnt </I> is a two
digit decimal integer indicating the number of generic frames listed,
<I>f_num </I> is a two digit decimal integer frame number, and <I>w_num </I> is a two
digit hexadecimal integer indicating the word in the synset that the frame
applies to. As with pointers, if this number is <B>00 </B>, <I>f_num </I> applies to
all <I>word </I>s in the synset. If non-zero, it is applicable only to the word
indicated. Word numbers are assigned as described for pointers. Each <I>f_num&nbsp;&nbsp;w_num
</I> pair is preceded by a <B>+ </B>. See <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
for the text of the generic
sentence frames.
<DL>
<DT><I>gloss</I> </DT>
<DD>Each synset contains a gloss. A <I>gloss </I> is represented
as a vertical bar (<B>| </B>), followed by a text string that continues until
the end of the line. The gloss may contain a definition, one or more example
sentences, or both. </DD>
</DL>
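<P>
Putting the data file fields together, a line can be parsed positionally: the counts (<I>w_cnt </I>, <I>p_cnt </I>, <I>f_cnt </I>) say how many items follow. The sketch below is an illustration under stated assumptions, not the code behind <B>read_synset </B>: the names <I>Pointer </I> and <I>parse_data_line </I> are hypothetical, the sample line is invented to match the layout, and an adjective's syntactic marker is left attached to the word.

```python
from typing import Dict, List, NamedTuple, Tuple


class Pointer(NamedTuple):
    symbol: str
    synset_offset: int
    pos: str
    source_word: int  # 0 with target_word 0 means a semantic pointer
    target_word: int


def parse_data_line(line: str) -> Dict[str, object]:
    """Parse one non-header line of a data.pos file (hypothetical helper)."""
    # Everything after the first vertical bar is the gloss.
    body, _, gloss = line.partition("|")
    f = body.split()
    w_cnt = int(f[3], 16)  # two digit hexadecimal
    i = 4
    words: List[Tuple[str, int]] = []
    for _ in range(w_cnt):
        words.append((f[i], int(f[i + 1], 16)))  # lex_id is hexadecimal
        i += 2
    p_cnt = int(f[i])  # three digit decimal
    i += 1
    pointers: List[Pointer] = []
    for _ in range(p_cnt):
        symbol, offset, pos, st = f[i:i + 4]
        pointers.append(
            Pointer(symbol, int(offset), pos, int(st[:2], 16), int(st[2:], 16)))
        i += 4
    frames: List[Tuple[int, int]] = []
    if i < len(f):  # frames are present in data.verb only
        f_cnt = int(f[i])
        i += 1
        for _ in range(f_cnt):
            if f[i] != "+":
                raise ValueError("each frame pair must be preceded by '+'")
            frames.append((int(f[i + 1]), int(f[i + 2], 16)))
            i += 3
    return {
        "synset_offset": int(f[0]),
        "lex_filenum": int(f[1]),
        "ss_type": f[2],
        "words": words,
        "pointers": pointers,
        "frames": frames,
        "gloss": gloss.strip(),
    }


# Invented verb line in the documented layout (not from a real data file).
sample = ("00001740 29 v 02 breathe 0 respire 1 002 @ 00002325 v 0000 "
          "+ 00831191 n 0103 02 + 02 00 + 08 01 | draw air into the lungs")
synset = parse_data_line(sample)
print(synset["words"], synset["frames"])
```

Note that <B>+ </B> can appear both as a pointer symbol and as the frame separator; reading by position, as above, keeps the two apart.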
<H3><A NAME="sect4" HREF="#toc4">Sense Numbers </A></H3>
Senses in WordNet are generally ordered
from most to least frequently used, with the most common sense numbered
<B>1 </B>. Frequency of use is determined by the number of times a sense is tagged
in the various semantic concordance texts. Senses that are not semantically
tagged follow the ordered senses. The <I>tagsense_cnt </I> field for each entry
in the <B>index.<I>pos </I></B> files indicates how many of the senses in the list have
been tagged. <P>
The <B><A HREF="cntlist.5WN.html">cntlist</B>(5WN)</A>
file provided with the database lists the
number of times each sense is tagged in the semantic concordances. The
data from <B>cntlist </B> is used by <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
to order the senses of each word.
When the <B>index </B>.<I>pos </I> files are generated, the <I>synset_offset </I>s are output
in sense number order, with sense 1 first in the list. Senses with the
same number of semantic tags are assigned unique but consecutive sense
numbers. The WordNet <FONT SIZE=-1><B>OVERVIEW </B></FONT>
search displays all senses of the specified
word, in all syntactic categories, and indicates which of the senses are
represented in the semantically tagged texts.
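<P>
Because the <I>synset_offset </I>s of an index entry are listed in sense number order, and tagged senses come first, <I>tagsense_cnt </I> is enough to separate tagged from untagged senses. A minimal sketch, with a hypothetical function name and made-up offsets:

```python
from typing import List, Tuple


def split_by_tagging(
    synset_offsets: List[int], tagsense_cnt: int
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Return (tagged, untagged) lists of (sense_number, synset_offset)."""
    # Sense numbers start at 1 and follow the order of the offsets.
    numbered = list(enumerate(synset_offsets, start=1))
    return numbered[:tagsense_cnt], numbered[tagsense_cnt:]


tagged, untagged = split_by_tagging([2084071, 10114209, 7676602], 1)
print(tagged)
print(untagged)
```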
<H3><A NAME="sect5" HREF="#toc5">Exception List File Format
</A></H3>
Exception lists are alphabetized lists of inflected forms of words and
their base forms. The first field of each line is an inflected form, followed
by a space separated list of one or more base forms of the word. There
is one exception list file for each syntactic category. <P>
Note that the
noun and verb exception lists were automatically generated from a machine-readable
dictionary, and contain many words that are not in WordNet. Also, for
many of the inflected forms, base forms could be easily derived using
the standard rules of detachment programmed into Morphy (see <B><A HREF="morphy.7WN.html">morphy</B>(7WN)</A>
).
These anomalies are allowed to remain in the exception list files, as
they do no harm. <P>
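<P>
An exception list maps naturally onto a dictionary from inflected form to base forms. The sketch below assumes the line layout described above; the function names and the sample entries are illustrative only and are not taken from the distributed files.

```python
from typing import Dict, Iterable, List


def parse_exception_list(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Build {inflected_form: [base_form, ...]} from pos.exc lines."""
    table: Dict[str, List[str]] = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        table[parts[0]] = parts[1:]
    return table


def irregular_base_forms(word: str, table: Dict[str, List[str]]) -> List[str]:
    """Look up a word; an empty list means no exception entry exists."""
    return table.get(word.lower(), [])


# Made-up sample lines in the documented layout.
noun_exc = parse_exception_list(["geese goose", "axes axis ax axe"])
print(irregular_base_forms("Geese", noun_exc))
```

When the lookup returns an empty list, a caller would fall back to the regular rules of detachment.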
<H3><A NAME="sect6" HREF="#toc6">Verb Example Sentences </A></H3>
For some verb senses, example
sentences illustrating the use of the verb sense can be displayed. Each
line of the file <B>sentidx.vrb </B> contains a <I>sense_key </I> followed by a space
and a comma separated list of example sentence template numbers, in decimal.
The file <B>sents.vrb </B> lists all of the example sentence templates. Each
line begins with the template number followed by a space. The rest of
the line is the text of a template example sentence, with <B>%s </B> used as
a placeholder in the text for the verb. Both files are sorted alphabetically
so that the <I>sense_key </I> and template sentence number can be used as indices,
via <B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A>
, into the appropriate file. <P>
When a request for <FONT SIZE=-1><B>FRAMES
</B></FONT>
is made, the WordNet search code looks for the sense in <B>sentidx.vrb </B>.
If found, the sentence template(s) listed are retrieved from <B>sents.vrb
</B>, and the <B>%s </B> is replaced with the verb. If the sense is not found, the
applicable generic sentence frame(s) listed in <I>frames </I> are displayed.
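<P>
The lookup described above can be sketched as two dictionaries and a string substitution. This is an illustration only: the function names are hypothetical, the sample sense key and template sentences are invented, and plain dictionaries stand in for the binary search that the WordNet library performs on the sorted files.

```python
from typing import Dict, Iterable, List


def parse_sentidx(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Map each sense_key to its list of template numbers."""
    index: Dict[str, List[int]] = {}
    for line in lines:
        sense_key, numbers = line.strip().split(" ", 1)
        index[sense_key] = [int(n) for n in numbers.split(",")]
    return index


def parse_sents(lines: Iterable[str]) -> Dict[int, str]:
    """Map each template number to its template text."""
    templates: Dict[int, str] = {}
    for line in lines:
        number, text = line.rstrip("\n").split(" ", 1)
        templates[int(number)] = text
    return templates


def example_sentences(sense_key: str, verb: str,
                      sentidx: Dict[str, List[int]],
                      sents: Dict[int, str]) -> List[str]:
    """Fill %s with the verb; an empty list means 'use generic frames'."""
    return [sents[n].replace("%s", verb) for n in sentidx.get(sense_key, [])]


# Invented sample data in the documented layout.
sentidx = parse_sentidx(["walk%2:38:00:: 1,2"])
sents = parse_sents(["1 The children %s to the playground",
                     "2 They %s every morning"])
result = example_sentences("walk%2:38:00::", "walk", sentidx, sents)
print(result)
```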
<H2><A NAME="sect7" HREF="#toc7">NOTES
</A></H2>
Information in the <B>data.<I>pos </I></B> and <B>index.<I>pos </I></B> files represents all of the
word senses and synsets in the WordNet database. The <I>word </I>, <I>lex_id </I>, and
<I>lex_filenum </I> fields together uniquely identify each word sense in WordNet.
These can be encoded in a <I>sense_key </I> as described in <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
. Each
synset in the database can be uniquely identified by combining the <I>synset_offset
</I> for the synset with a code for the syntactic category (since it is possible
for synsets in different <B>data.<I>pos </I></B> files to have the same <I>synset_offset
</I>). <P>
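<P>
One way to form such a synset identifier is to prefix the zero-filled offset with the syntactic category code. The sketch below is only one possible convention, not a format defined by WordNet: the function name is hypothetical, and mapping the satellite type <B>s </B> to <B>a </B> is an assumption based on satellites being stored in <B>data.adj </B>.

```python
def synset_id(ss_type: str, synset_offset: int) -> str:
    """Combine a category code and an offset into one identifier."""
    # Assumption: adjective satellites ('s') live in data.adj, so they
    # share the 'a' category code for identification purposes.
    pos = "a" if ss_type == "s" else ss_type
    if pos not in ("n", "v", "a", "r"):
        raise ValueError("unknown syntactic category: " + ss_type)
    return f"{pos}{synset_offset:08d}"


print(synset_id("n", 1740))
print(synset_id("v", 1740))
```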
The WordNet system provides both command line and window-based browser
interfaces to the database. Both interfaces utilize a common library of
search and morphology code. The source code for the library and interfaces
is included in the WordNet package. See <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
for an overview of
the WordNet source code.
<H2><A NAME="sect8" HREF="#toc8">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory
for WordNet. Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in
which the WordNet database has been installed. Default is <B>WNHOME/dict
</B>. </DD>
</DL>
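<P>
A program that reads the database files directly might resolve the database directory from these variables as sketched below. The function name is hypothetical, and giving <B>WNSEARCHDIR </B> precedence over <B>WNHOME </B> is an assumption drawn from the stated defaults, not a description of the WordNet library's code.

```python
import os
import posixpath
from typing import Mapping, Optional

DEFAULT_WNHOME = "/usr/local/WordNet-3.0"


def wordnet_search_dir(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the directory expected to hold the index and data files."""
    env = os.environ if environ is None else environ
    search_dir = env.get("WNSEARCHDIR")
    if search_dir:
        return search_dir
    # Fall back to WNHOME/dict, using the documented default for WNHOME.
    return posixpath.join(env.get("WNHOME", DEFAULT_WNHOME), "dict")


print(wordnet_search_dir({}))
```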
<H2><A NAME="sect9" HREF="#toc9">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory
for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect10" HREF="#toc10">FILES </A></H2>
<DL>
<DT><B>index.<I>pos </I></B> </DT>
<DD>database
index files </DD>
<DT><B>data.<I>pos </I></B> </DT>
<DD>database data files </DD>
<DT><B>*.vrb</B> </DT>
<DD>files of sentences illustrating
the use of verbs </DD>
<DT><B><I>pos </I>.exc</B> </DT>
<DD>morphology exception lists </DD>
</DL>
<H2><A NAME="sect11" HREF="#toc11">SEE ALSO </A></H2>
<B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
,
<B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
, <B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A>
, <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
, <B><A HREF="cntlist.5WN.html">cntlist</B>(5WN)</A>
,
<B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
, <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
, <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
, <B><A HREF="morphy.7WN.html">morphy</B>(7WN)</A>
, <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
,
<B><A HREF="wngroups.7WN.html">wngroups</B>(7WN)</A>
, <B><A HREF="wnstats.7WN.html">wnstats</B>(7WN)</A>
. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">Index File Format</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Data File Format</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Sense Numbers</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Exception List File Format</A></LI>
<LI><A NAME="toc6" HREF="#sect6">Verb Example Sentences</A></LI>
</UL>
<LI><A NAME="toc7" HREF="#sect7">NOTES</A></LI>
<LI><A NAME="toc8" HREF="#sect8">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc9" HREF="#sect9">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc10" HREF="#sect10">FILES</A></LI>
<LI><A NAME="toc11" HREF="#sect11">SEE ALSO</A></LI>
</UL>
</BODY></HTML>


@ -1,325 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNGLOSS(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wngloss - glossary of terms used in WordNet system
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION
</A></H2>
The <I>WordNet Reference Manual </I> consists of Unix-style manual pages divided
into sections as follows: <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>Section </B> </TD> <TD ALIGN=CENTER><B>Description </B> </TD> </TR>
<TR> <TD ALIGN=CENTER>1 </TD> <TD ALIGN=LEFT>WordNet User
Commands </TD> </TR>
<TR> <TD ALIGN=CENTER>3 </TD> <TD ALIGN=LEFT>WordNet Library Functions </TD> </TR>
<TR> <TD ALIGN=CENTER>5 </TD> <TD ALIGN=LEFT>WordNet File Formats </TD> </TR>
<TR> <TD ALIGN=CENTER>7 </TD> <TD ALIGN=LEFT>Miscellaneous Information about WordNet </TD> </TR>
</TABLE>
<P>
<H3><A NAME="sect2" HREF="#toc2">System Description </A></H3>
The
WordNet system consists of lexicographer files, code to convert these
files into a database, and search routines and interfaces that display
information from the database. The lexicographer files organize nouns,
verbs, adjectives and adverbs into groups of synonyms, and describe relations
between synonym groups. <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
converts the lexicographer files into
a database that encodes the relations between the synonym groups. The
different interfaces to the WordNet database utilize a common library
of search routines to display these relations. Note that the lexicographer
files and <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
program are not generally distributed. <P>
<H3><A NAME="sect3" HREF="#toc3">Database
Organization </A></H3>
Information in WordNet is organized around logical groupings
called synsets. Each synset consists of a list of synonymous words or
collocations (e.g. <B>"fountain pen" </B>, <B>"take in" </B>), and pointers that describe
the relations between this synset and other synsets. A word or collocation
may appear in more than one synset, and in more than one part of speech.
The words in a synset are grouped such that they are interchangeable
in some context. <P>
Two kinds of relations are represented by pointers: lexical
and semantic. Lexical relations hold between semantically related word
forms; semantic relations hold between word meanings. These relations
include (but are not limited to) hypernymy/hyponymy (superordinate/subordinate),
antonymy, entailment, and meronymy/holonymy. <P>
Nouns and verbs are organized
into hierarchies based on the hypernymy/hyponymy relation between synsets.
Additional pointers are used to indicate other relations. <P>
Adjectives
are arranged in clusters containing head synsets and satellite synsets.
Each cluster is organized around antonymous pairs (and occasionally antonymous
triplets). The antonymous pairs (or triplets) are indicated in the head
synsets of a cluster. Most head synsets have one or more satellite synsets,
each of which represents a concept that is similar in meaning to the concept
represented by the head synset. One way to think of the adjective cluster
organization is to visualize a wheel, with a head synset as the hub and
satellite synsets as the spokes. Two or more wheels are logically connected
via antonymy, which can be thought of as an axle between the wheels. <P>
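<P>
The wheel picture can be modelled with two small mappings: one from each satellite word to the head word of its cluster half, and one between antonymous heads. The sketch below uses invented sample words and hypothetical names; it only illustrates how an adjective in a satellite synset reaches an antonym through its head.

```python
from typing import Dict, Optional

# Invented sample data: satellite word -> head word (spoke -> hub).
head_of: Dict[str, str] = {"arid": "dry", "parched": "dry", "soggy": "wet"}
# Antonymous heads (the axle between two wheels).
direct_antonym: Dict[str, str] = {"dry": "wet", "wet": "dry"}


def indirect_antonym(adjective: str) -> Optional[str]:
    """Follow satellite -> head -> antonymous head, if such a path exists."""
    if adjective in direct_antonym:
        return None  # head words have a direct antonym instead
    head = head_of.get(adjective)
    if head is None:
        return None
    return direct_antonym.get(head)


print(indirect_antonym("arid"))
print(indirect_antonym("soggy"))
```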
Pertainyms
are relational adjectives and do not follow the structure just described.
Pertainyms do not have antonyms; the synset for a pertainym most often
contains only one word or collocation and a lexical pointer to the noun
that the adjective is "pertaining to". Participial adjectives have lexical
pointers to the verbs that they are derived from. <P>
Adverbs are often derived
from adjectives, and sometimes have antonyms; therefore the synset for
an adverb usually contains a lexical pointer to the adjective from which
it is derived. <P>
See <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
for a detailed description of the database
files and how the data are represented.
<H2><A NAME="sect4" HREF="#toc4">GLOSSARY OF TERMS </A></H2>
Many terms
used in the <I>WordNet Reference Manual </I> are unique to the WordNet system.
Other general terms have specific meanings when used in the WordNet documentation.
Definitions for many of these terms are given to help with the interpretation
and understanding of the reference manual, and in the use of the WordNet
system. <P>
In the following definitions <B>word </B> is used in place of <B>word or collocation
</B>.
<DL>
<DT><B>adjective cluster</B> </DT>
<DD>A group of adjective synsets that are organized around
antonymous pairs or triplets. An adjective cluster contains two or more
<B>head synsets </B> which represent antonymous concepts. Each head synset has
one or more <B>satellite synsets </B>. </DD>
<DT><B>attribute</B> </DT>
<DD>A noun for which adjectives
express values. The noun <B>weight </B> is an attribute, for which the adjectives
<B>light </B> and <B>heavy </B> express values. </DD>
<DT><B>base form</B> </DT>
<DD>The base form of a word
or collocation is the form to which inflections are added. </DD>
<DT><B>basic synset</B>
</DT>
<DD>Syntactically, same as <B>synset </B>. Term is used in <B><A HREF="wninput.5WN.html">wninput</B>(5WN)</A>
to help
explain differences in entering synsets in lexicographer files. </DD>
<DT><B>collocation</B>
</DT>
<DD>A collocation in WordNet is a string of two or more words, connected
by spaces or hyphens. Examples are: <B>man-eating&nbsp;shark </B>, <B>blue-collar </B>, <B>depend&nbsp;on
</B>, <B>line&nbsp;of&nbsp;products </B>. In the database files spaces are represented as underscore
(<B>_ </B>) characters. </DD>
<DT><B>coordinate</B> </DT>
<DD>Coordinate terms are nouns or verbs that have
the same <B>hypernym </B>. </DD>
<DT><B>cross-cluster pointer</B> </DT>
<DD>A <B>semantic pointer </B> from one
adjective cluster to another. </DD>
<DT><B>derivationally related forms</B> </DT>
<DD>Terms in different
syntactic categories that have the same root form and are semantically
related. </DD>
<DT><B>direct antonyms</B> </DT>
<DD>A pair of words between which there is an associative
bond resulting from their frequent co-occurrence. In <B>adjective clusters
</B>, direct antonyms appear only in <B>head synsets </B>. </DD>
<DT><B>domain</B> </DT>
<DD>A topical classification
to which a synset has been linked with a CATEGORY, REGION or USAGE pointer.
</DD>
<DT><B>domain term</B> </DT>
<DD>A synset belonging to a topical class. A domain term is further
identified as being a CATEGORY_TERM, REGION_TERM or USAGE_TERM. </DD>
<DT><B>entailment</B>
</DT>
<DD>A verb <B>X </B> entails <B>Y </B> if <B>X </B> cannot be done unless <B>Y </B> is, or has been,
done. </DD>
<DT><B>exception list</B> </DT>
<DD>Morphological transformations for words that are
not regular and therefore cannot be processed in an algorithmic manner.
</DD>
<DT><B>group</B> </DT>
<DD>Verb senses that are similar in meaning and have been manually grouped
together. </DD>
<DT><B>gloss</B> </DT>
<DD>Each synset contains a <B>gloss </B> consisting of a definition
and optionally example sentences. </DD>
<DT><B>head synset</B> </DT>
<DD>Synset in an adjective <B>cluster
</B> containing at least one word that has a <B>direct antonym </B>. </DD>
<DT><B>holonym</B> </DT>
<DD>The
name of the whole of which the meronym names a part. <B>Y </B> is a holonym
of <B>X </B> if <B>X </B> is a part of <B>Y </B>. </DD>
<DT><B>hypernym</B> </DT>
<DD>The generic term used to designate
a whole class of specific instances. <B>Y </B> is a hypernym of <B>X </B> if <B>X </B> is a
(kind of) <B>Y </B>. </DD>
<DT><B>hyponym</B> </DT>
<DD>The specific term used to designate a member of
a class. <B>X </B> is a hyponym of <B>Y </B> if <B>X </B> is a (kind of) <B>Y </B>. </DD>
<DT><B>indirect antonym</B>
</DT>
<DD>An adjective in a <B>satellite synset </B> that does not have a <B>direct antonym
</B> has an indirect antonym via the direct antonym of the <B>head synset </B>. </DD>
<DT><B>instance</B>
</DT>
<DD>A proper noun that refers to a particular, unique referent (as distinguished
from nouns that refer to classes). This is a specific form of hyponym.
</DD>
<DT><B>lemma</B> </DT>
<DD>Lower case ASCII text of word as found in the WordNet database
index files. Usually the <B>base form </B> for a word or collocation. </DD>
<DT><B>lexical
pointer</B> </DT>
<DD>A lexical pointer indicates a relation between words in synsets
(word forms). </DD>
<DT><B>lexicographer file</B> </DT>
<DD>Files containing the raw data for WordNet
synsets, edited by lexicographers, that are input to the <B>grind </B> program
to generate a WordNet database. </DD>
<DT><B>lexicographer id (lex id)</B> </DT>
<DD>A decimal integer
that, when appended onto <B>lemma </B>, uniquely identifies a sense within a
lexicographer file. </DD>
<DT><B>monosemous</B> </DT>
<DD>Having only one sense in a syntactic category.
</DD>
<DT><B>meronym</B> </DT>
<DD>The name of a constituent part of, the substance of, or a member
of something. <B>X </B> is a meronym of <B>Y </B> if <B>X </B> is a part of <B>Y </B>. </DD>
<DT><B>part of speech</B>
</DT>
<DD>WordNet defines "part of speech" as either noun, verb, adjective, or
adverb. Same as <B>syntactic category </B>. </DD>
<DT><B>participial adjective</B> </DT>
<DD>An adjective
that is derived from a verb. </DD>
<DT><B>pertainym</B> </DT>
<DD>A relational adjective. Adjectives
that are pertainyms are usually defined by such phrases as "of or pertaining
to" and do not have antonyms. A pertainym can point to a noun or another
pertainym. </DD>
<DT><B>polysemous</B> </DT>
<DD>Having more than one sense in a syntactic category.
</DD>
<DT><B>polysemy count</B> </DT>
<DD>Number of senses of a word in a syntactic category, in
WordNet. </DD>
<DT><B>postnominal</B> </DT>
<DD>A postnominal adjective occurs only immediately following
the noun that it modifies. </DD>
<DT><B>predicative</B> </DT>
<DD>An adjective that can be used
only in predicate positions. If <B>X </B> is a predicate adjective, it can only
be used in such phrases as "it is <B>X </B>" and never prenominally. </DD>
<DT><B>prenominal</B>
</DT>
<DD>An adjective that can occur only before the noun that it modifies: it
cannot be used predicatively. </DD>
<DT><B>satellite synset</B> </DT>
<DD>Synset in an adjective
<B>cluster </B> representing a concept that is similar in meaning to the concept
represented by its <B>head synset </B>. </DD>
<DT><B>semantic concordance</B> </DT>
<DD>A textual corpus
(e.g. the Brown Corpus) and a lexicon (e.g. WordNet) so combined that every
substantive word in the text is linked to its appropriate sense in the
lexicon via a <B>semantic tag </B>. </DD>
<DT><B>semantic tag</B> </DT>
<DD>A pointer from a word in a text
file to a specific sense of that word in the WordNet database. A semantic
tag in a semantic concordance is represented by a <B>sense key </B>. </DD>
<DT><B>semantic
pointer</B> </DT>
<DD>A semantic pointer indicates a relation between synsets (concepts).
</DD>
<DT><B>sense</B> </DT>
<DD>A meaning of a word in WordNet. Each sense of a word is in a different
<B>synset </B>. </DD>
<DT><B>sense key</B> </DT>
<DD>Information necessary to find a sense in the WordNet
database. A sense key combines a <B>lemma </B> field and codes for the synset
type, lexicographer id, lexicographer file number, and information about
a satellite's <B>head synset </B>, if required. See <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
for a description
of the format of a sense key. </DD>
<DT><B>subordinate</B> </DT>
<DD>Same as <B>hyponym </B>. </DD>
<DT><B>superordinate</B>
</DT>
<DD>Same as <B>hypernym </B>. </DD>
<DT><B>synset</B> </DT>
<DD>A synonym set; a set of words that are interchangeable
in some context without changing the truth value of the proposition in
which they are embedded. </DD>
<DT><B>troponym</B> </DT>
<DD>A verb expressing a specific manner
elaboration of another verb. <B>X </B> is a troponym of <B>Y </B> if <B>to X </B> is <B>to Y </B> in
some manner. </DD>
<DT><B>unique beginner</B> </DT>
<DD>A noun synset with no <B>superordinate </B>. </DD>
</DL>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">System Description</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Database Organization</A></LI>
</UL>
<LI><A NAME="toc4" HREF="#sect4">GLOSSARY OF TERMS</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,80 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNGROUPS(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wngroups - discussion of WordNet search code to group similar verb
senses
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
Some similar senses of verbs have been grouped by
the lexicographers. This grouping is done statically in the lexicographer
source files using the semantic <I>pointer_symbol </I> <B>$ </B>. Transitivity is used
to combine groups of overlapping senses into the largest sense groups
possible.
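<P>
The transitive merging described above can be sketched with a small union-find routine. This is only an illustration, not the WordNet search code itself: the function name <B>merge_groups</B> and the use of plain strings as sense identifiers are assumptions made for the example.

```python
def merge_groups(pairs):
    """Merge pairwise verb-group links into the largest possible groups.

    `pairs` is an iterable of (sense_a, sense_b) tuples, each standing for
    one `$` pointer between two senses (hypothetical string identifiers).
    """
    parent = {}

    def find(sense):
        # Return the representative of the group containing `sense`.
        parent.setdefault(sense, sense)
        while parent[sense] != sense:
            parent[sense] = parent[parent[sense]]  # path halving
            sense = parent[sense]
        return sense

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two groups

    groups = {}
    for sense in list(parent):
        groups.setdefault(find(sense), set()).add(sense)
    return sorted(sorted(group) for group in groups.values())
```

Two links that share a sense, such as (a, b) and (b, c), end up in one group of three, which is the effect of applying transitivity to overlapping groups.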
<H2><A NAME="sect2" HREF="#toc2">NOTES </A></H2>
Coverage of verb groups is incomplete.
<H2><A NAME="sect3" HREF="#toc3">ENVIRONMENT VARIABLES
(UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet. Default is <B>/usr/local/WordNet-3.0
</B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the WordNet database has been installed.
Default is <B>WNHOME/dict </B>. </DD>
</DL>
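<P>
As a rough sketch of how the two variables interact, the lookup below falls back to the documented defaults when a variable is unset. The helper name <B>wordnet_search_dir</B> is hypothetical, and the real library may resolve these paths differently.

```python
import os


def wordnet_search_dir(env=None):
    """Resolve the database directory from WNHOME and WNSEARCHDIR.

    `env` is any mapping of variable names to values; it defaults to the
    process environment. Defaults are the ones stated in this manual page.
    """
    env = os.environ if env is None else env
    home = env.get("WNHOME", "/usr/local/WordNet-3.0")
    # WNSEARCHDIR, when set, takes precedence over WNHOME/dict.
    return env.get("WNSEARCHDIR", home + "/dict")
```

Passing a plain dictionary instead of the real environment makes the fallback order easy to check.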
<H2><A NAME="sect4" HREF="#toc4">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B>
</DT>
<DD>Base directory for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect5" HREF="#toc5">FILES
</A></H2>
<DL>
<DT><B>sentidx.vrb</B> </DT>
<DD>verb sense keys and sentence frame numbers </DD>
<DT><B>sents.vrb</B> </DT>
<DD>example
sentence frames </DD>
</DL>
<H2><A NAME="sect6" HREF="#toc6">SEE ALSO </A></H2>
<B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="senseidx.5WN.html">senseidx</B>(5WN)</A>
, <B><A HREF="wnsearch.3WN.html">wnsearch</B>(3WN)</A>
,
<B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="wnintro.7WN.html">wnintro</B>(7WN)</A>
. <P>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<LI><A NAME="toc2" HREF="#sect2">NOTES</A></LI>
<LI><A NAME="toc3" HREF="#sect3">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc4" HREF="#sect4">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc5" HREF="#sect5">FILES</A></LI>
<LI><A NAME="toc6" HREF="#sect6">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,491 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNINPUT(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
noun.<I>suffix </I>, verb.<I>suffix </I>, adj.<I>suffix </I>, adv.<I>suffix </I> - WordNet lexicographer
files that are input to <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
WordNet's source files
are written by lexicographers. They are the product of a detailed relational
analysis of lexical semantics: a variety of lexical and semantic relations
are used to represent the organization of lexical knowledge. Two kinds
of building blocks are distinguished in the source files: word forms and
word meanings. Word forms are represented in their familiar orthography;
word meanings are represented by synonym sets (<I>synset </I>s) - lists of synonymous
word forms that are interchangeable in some context. Two kinds of relations
are recognized: lexical and semantic. Lexical relations hold between word
forms; semantic relations hold between word meanings. <P>
Lexicographer files
correspond to the syntactic categories implemented in WordNet - noun, verb,
adjective and adverb. All of the synsets in a lexicographer file are in
the same syntactic category. Each synset consists of a list of synonymous
words or collocations (e.g. <B>"fountain pen" </B>, <B>"take in" </B>), and pointers that
describe the relations between this synset and other synsets. These relations
include (but are not limited to) hypernymy/hyponymy, antonymy, entailment,
and meronymy/holonymy. A word or collocation may appear in more than one
synset, and in more than one part of speech. Each use of a word in a synset
represents a sense of that word in the part of speech corresponding to
the synset. <P>
Adjectives may be organized into clusters containing head
synsets and satellite synsets. Adverbs generally point to the adjectives
from which they are derived. <P>
See <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
for a glossary of WordNet
terminology and a discussion of the database's content and logical organization.
<H3><A NAME="sect2" HREF="#toc2">Lexicographer File Names </A></H3>
The names of the lexicographer files are of
the form: <P>
<blockquote><I>pos</I>.<I>suffix</I> </blockquote>
<P>
where <I>pos </I> is either <B>noun </B>, <B>verb </B>, <B>adj </B> or <B>adv
</B>. <I>suffix </I> may be used to organize groups of synsets into different files,
for example <B>noun.animal </B> and <B>noun.plant </B>. See <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
for a list of
lexicographer file names that are used in building WordNet.
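<P>
A minimal sketch of checking a name against the <I>pos</I>.<I>suffix</I> form follows. The function name is hypothetical, and treating the suffix as mandatory is an assumption based on the form shown above.

```python
VALID_POS = ("noun", "verb", "adj", "adv")


def split_lex_filename(name):
    """Split a lexicographer file name into (pos, suffix)."""
    pos, dot, suffix = name.partition(".")
    if pos not in VALID_POS or not dot or not suffix:
        raise ValueError("not of the form pos.suffix: %r" % name)
    return pos, suffix
```

For example, the name noun.animal splits into the pair noun and animal, while a name with an unknown part of speech is rejected.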
<H3><A NAME="sect3" HREF="#toc3">Pointers </A></H3>
Pointers
are used to represent the relations between the words in one synset and
another. Semantic pointers represent relations between word meanings,
and therefore pertain to all of the words in the source and target synsets.
Lexical pointers represent relations between word forms, and pertain
only to specific words in the source and target synsets. The following
pointer types are usually used to indicate lexical relations: Antonym,
Pertainym, Participle, Also See, Derivationally Related. The remaining
pointer types are generally used to represent semantic relations. <P>
A relation
from a source to a target synset is formed by specifying a word from the
target synset in the source synset, followed by the <I>pointer_symbol </I> indicating
the pointer type. The location of a pointer within a synset defines it
as either lexical or semantic. The <FONT SIZE=-1><B>Lexicographer File Format </B></FONT>
section
describes the syntax for entering a semantic pointer, and <FONT SIZE=-1><B>Word Syntax
</B></FONT>
describes the syntax for entering a lexical pointer. <P>
Although there
are many pointer types, only certain types of relations are permitted
between synsets of each syntactic category. <P>
The <I>pointer_symbol </I>s for nouns
are: <blockquote><B>! </B> <tt> </tt>&nbsp;<tt> </tt>&nbsp;Antonym <BR>
<B>@ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Hypernym <BR>
<B>@i </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Instance Hypernym <BR>
<B>~ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Hyponym <BR>
<B>~i </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Instance Hyponym <BR>
<B>#m </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Member holonym <BR>
<B>#s </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Substance holonym <BR>
<B>#p </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Part holonym <BR>
<B>%m
</B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Member meronym <BR>
<B>%s </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Substance meronym <BR>
<B>%p </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Part meronym <BR>
<B>= </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Attribute <BR>
<B>+
</B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Derivationally related form<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp; <BR>
<B>;c </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - TOPIC <BR>
<B>-c </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Member of this
domain - TOPIC <BR>
<B>;r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - REGION <BR>
<B>-r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Member of this domain - REGION
<BR>
<B>;u </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - USAGE <BR>
<B>-u </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Member of this domain - USAGE <BR>
</blockquote>
<P>
The <I>pointer_symbol
</I>s for verbs are: <blockquote><B>! </B> <tt> </tt>&nbsp;<tt> </tt>&nbsp;Antonym <BR>
<B>@ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Hypernym <BR>
<B>~ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Hyponym <BR>
<B>* </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Entailment <BR>
<B>&gt; </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Cause
<BR>
<B>^ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Also see <BR>
<B>$ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Verb Group <BR>
<B>+ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Derivationally related form<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp; <BR>
<B>;c </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of
synset - TOPIC <BR>
<B>;r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - REGION <BR>
<B>;u </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - USAGE
<BR>
</blockquote>
<P>
The <I>pointer_symbol </I>s for adjectives are: <blockquote><B>! </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Antonym <BR>
<B>&amp; </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Similar to <BR>
<B>&lt;
</B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Participle of verb <BR>
<B>\ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Pertainym (pertains to noun) <BR>
<B>= </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Attribute <BR>
<B>^ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Also
see <BR>
<B>;c </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - TOPIC <BR>
<B>;r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - REGION <BR>
<B>;u </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain
of synset - USAGE <BR>
</blockquote>
<P>
The <I>pointer_symbol </I>s for adverbs are: <blockquote><B>! </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Antonym <BR>
<B>\ </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Derived from adjective <BR>
<B>;c </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - TOPIC <BR>
<B>;r </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset
- REGION <BR>
<B>;u </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;Domain of synset - USAGE <BR>
</blockquote>
<P>
Many pointer types are reflexive,
meaning that if a synset contains a pointer to another synset, the other
synset should contain a corresponding reflexive pointer. <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
automatically
inserts missing reflexive pointers for the following pointer types: <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>Pointer </B> </TD> <TD ALIGN=CENTER><B>Reflect </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Antonym </TD> <TD ALIGN=LEFT>Antonym </TD> </TR>
<TR> <TD ALIGN=LEFT>Hyponym </TD> <TD ALIGN=LEFT>Hypernym </TD> </TR>
<TR> <TD ALIGN=LEFT>Hypernym
</TD> <TD ALIGN=LEFT>Hyponym </TD> </TR>
<TR> <TD ALIGN=LEFT>Instance Hyponym </TD> <TD ALIGN=LEFT>Instance Hypernym </TD> </TR>
<TR> <TD ALIGN=LEFT>Instance Hypernym </TD>
<TD ALIGN=LEFT>Instance Hyponym </TD> </TR>
<TR> <TD ALIGN=LEFT>Holonym </TD> <TD ALIGN=LEFT>Meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>Meronym </TD> <TD ALIGN=LEFT>Holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>Similar to
</TD> <TD ALIGN=LEFT>Similar to </TD> </TR>
<TR> <TD ALIGN=LEFT>Attribute </TD> <TD ALIGN=LEFT>Attribute </TD> </TR>
<TR> <TD ALIGN=LEFT>Verb Group </TD> <TD ALIGN=LEFT>Verb Group </TD> </TR>
<TR> <TD ALIGN=LEFT>Derivationally
Related </TD> <TD ALIGN=LEFT>Derivationally Related </TD> </TR>
<TR> <TD ALIGN=LEFT>Domain of synset </TD> <TD ALIGN=LEFT>Member of Domain </TD>
</TR>
</TABLE>
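<P>
The table above can be expressed as a symbol lookup. This is a sketch, not the logic used by grind: the name <B>reflected_symbol</B> is hypothetical, the tilde is taken to be the hyponym symbol, and pairing each holonym variant with the meronym variant that carries the same letter (member, substance, part) is an assumption, as is pairing each domain symbol with the member symbol for the same domain type.

```python
# Pointer symbol -> symbol of the pointer inserted in the target synset.
REFLECT = {
    "!": "!",                                # Antonym
    "~": "@", "@": "~",                      # Hyponym / Hypernym
    "~i": "@i", "@i": "~i",                  # Instance Hyponym / Hypernym
    "#m": "%m", "#s": "%s", "#p": "%p",      # Holonym -> Meronym
    "%m": "#m", "%s": "#s", "%p": "#p",      # Meronym -> Holonym
    "&": "&",                                # Similar to
    "=": "=",                                # Attribute
    "$": "$",                                # Verb Group
    "+": "+",                                # Derivationally Related
    ";c": "-c", ";r": "-r", ";u": "-u",      # Domain of synset -> Member
}


def reflected_symbol(symbol):
    """Return the reflected pointer symbol, or None if none is listed."""
    return REFLECT.get(symbol)
```

Pointer types that are absent from the table, such as Entailment, return None here because no reflected pointer is listed for them.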
<H3><A NAME="sect4" HREF="#toc4">Verb Frames </A></H3>
Each verb synset contains a list of generic sentence frames
illustrating the types of simple sentences in which the verbs in the synset
can be used. For some verb senses, example sentences illustrating actual
uses of the verb are provided. (See <FONT SIZE=-1><B>Verb Example Sentences </B></FONT>
in <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
.)
Whenever there is no example sentence, the generic sentence frames specified
by the lexicographer are used. The generic sentence frames are entered
in a synset as a comma-separated list of integer frame numbers. The following
list is the text of the generic frames, preceded by their frame numbers:
<P>
<blockquote>1<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something ----s <BR>
2<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s <BR>
3<tt> </tt>&nbsp;<tt> </tt>&nbsp;It is ----ing <BR>
4<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something is ----ing PP <BR>
5<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something
----s something Adjective/Noun <BR>
6<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something ----s Adjective/Noun <BR>
7<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s Adjective
<BR>
8<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something <BR>
9<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody <BR>
10<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something ----s somebody <BR>
11<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something ----s something <BR>
12<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something ----s to somebody <BR>
13<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s on something
<BR>
14<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody something <BR>
15<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something to somebody <BR>
16<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something from somebody <BR>
17<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody with something
<BR>
18<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody of something <BR>
19<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something on somebody
<BR>
20<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody PP <BR>
21<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something PP <BR>
22<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s PP
<BR>
23<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody's (body part) ----s <BR>
24<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s somebody to INFINITIVE <BR>
25<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody
----s somebody INFINITIVE <BR>
26<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s that CLAUSE <BR>
27<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s to somebody
<BR>
28<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s to INFINITIVE <BR>
29<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s whether INFINITIVE <BR>
30<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody
----s somebody into V-ing something <BR>
31<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s something with something
<BR>
32<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s INFINITIVE <BR>
33<tt> </tt>&nbsp;<tt> </tt>&nbsp;Somebody ----s VERB-ing <BR>
34<tt> </tt>&nbsp;<tt> </tt>&nbsp;It ----s that CLAUSE <BR>
35<tt> </tt>&nbsp;<tt> </tt>&nbsp;Something
----s INFINITIVE <BR>
</blockquote>
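<P>
To illustrate how a generic frame becomes a sentence, the sketch below substitutes verb forms into the frame text. Only four of the frames are copied into the table, the caller supplies the inflected forms, and the function name <B>fill_frame</B> is hypothetical.

```python
# A few of the generic frames listed above, keyed by frame number.
FRAMES = {
    1: "Something ----s",
    3: "It is ----ing",
    8: "Somebody ----s something",
    26: "Somebody ----s that CLAUSE",
}


def fill_frame(frame_number, third_person_form, ing_form):
    """Replace the ----s or ----ing slot with an inflected verb form."""
    text = FRAMES[frame_number]
    # Handle ----ing first so it is not mistaken for a ----s slot.
    return text.replace("----ing", ing_form).replace("----s", third_person_form)
```

With the forms eats and eating, frame 8 reads "Somebody eats something".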
<H3><A NAME="sect5" HREF="#toc5">Lexicographer File Format </A></H3>
Synsets are entered one per
line, and each line is terminated with a newline character. A line containing
a synset may be as long as necessary, but no newlines can be entered within
a synset. Within a synset, spaces or tabs may be used to separate entities.
Items enclosed in italicized square brackets are optional. <P>
The
general synset syntax is: <P>
<blockquote><B>{ </B> <I>&nbsp;&nbsp;words&nbsp;&nbsp;pointers&nbsp;&nbsp; </I> <B>( </B> <I>&nbsp;gloss&nbsp; </I> <B>)&nbsp;&nbsp;} </B>
<BR>
</blockquote>
<P>
Synsets of this form are valid for all syntactic categories except
verb, and are referred to as basic synsets. At least one <I>word </I> and a <I>gloss
</I> are required to form a valid synset. Pointers entered following all the
<I>words </I> in a synset represent semantic relations between all the words
in the source and target synsets. <P>
For verbs, the basic synset syntax is
defined as follows: <P>
<blockquote><B>{ </B> <I>&nbsp;&nbsp;words&nbsp;&nbsp;pointers&nbsp;&nbsp;frames&nbsp;&nbsp; </I> <B>( </B> &nbsp;<I>gloss&nbsp; </I> <B>)&nbsp;&nbsp;}
</B> <BR>
</blockquote>
<P>
Adjectives may be organized into clusters containing one or more head
synsets and optional satellite synsets. Adjective clusters are of the
form: <P>
<blockquote><B>[ </B><BR>
<I>head synset </I><BR>
[satellite synsets] <BR>
[-] <BR>
[additional head/satellite
synsets] <BR>
<B>] </B> <BR>
</blockquote>
<P>
Each adjective cluster is enclosed in square brackets,
and may have one or more parts. Each part consists of a head synset and
optional satellite synsets that are conceptually similar to the head synset's
meaning. Parts of a cluster are separated by one or more hyphens (<B>- </B>) on
a line by themselves, with the terminating square bracket following the
last synset. Head and satellite synsets follow the syntax of basic synsets;
however, a "Similar to" pointer must be specified in a head synset for
each of its satellite synsets. Most adjective clusters contain two antonymous
parts. See <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
for a discussion of adjective clusters, and <FONT SIZE=-1><B>Special
Adjective Syntax </B></FONT>
for more information on adjective cluster syntax. <P>
Synsets
for relational adjectives (pertainyms) and participial adjectives do not
adhere to the cluster structure. They use the basic synset syntax. <P>
Comments
can be entered in a lexicographer file by enclosing the text of the comment
in parentheses. Note that comments <B>cannot </B> appear within a synset, as
parentheses within a synset have an entirely different meaning (see <FONT SIZE=-1><B>Gloss
Syntax </B></FONT>
). However, entire synsets (or adjective clusters) can be "commented
out" by enclosing them in parentheses. This is often used by the lexicographers
to verify the syntax of files under development or to leave a note to
oneself while working on entries.
<H3><A NAME="sect6" HREF="#toc6">Word Syntax </A></H3>
A synset must have at least
one word, and the words of a synset must appear after the opening brace
and before any other synset constructs. A word may be entered in either
the simple word or word/pointer syntax. <P>
A simple word is of the form:
<P>
<blockquote><I>word[ </I> <B>( </B> <I>marker </I> <B>) </B> <I>][lex_id] </I> <B>, </B> <BR>
</blockquote>
<P>
<I>word </I> may be entered in any combination
of upper and lower case unless it is in an adjective cluster. A collocation
is entered by joining the individual words with an underscore character
(<B>_ </B>). Numbers (integer or real) may be entered, either by themselves or
as part of a word string, by following the number with a double quote
(<B>" </B>). <P>
See <FONT SIZE=-1><B>Special Adjective Syntax </B></FONT>
for a description of adjective clusters
and markers. <P>
<I>word </I> may be followed by an integer <I>lex_id </I> from <B>1 </B> to <B>15
</B>. The <I>lex_id </I> is used to distinguish different senses of the same word
within a lexicographer file. The lexicographer assigns <I>lex_id </I> values,
usually in ascending order, although there is no requirement that the
numbers be consecutive. The default is <B>0 </B>, and does not have to be specified.
A <I>lex_id </I> must be used on pointers if the desired sense has a non-zero
<I>lex_id </I> in its synset specification. <P>
Word/pointer syntax is of the form:
<P>
<blockquote><B>[&nbsp;&nbsp; </B> <I>word[ </I> <B>( </B> <I>marker </I> <B>) </B> <I>][lex_id] </I> <B>, </B> <I>&nbsp;&nbsp;pointers&nbsp;&nbsp; </I> <B>] </B> <BR>
</blockquote>
<P>
This syntax
is used when one or more pointers correspond only to the specific word
in the word/pointer set, rather than all the words in the synset, and
represents a lexical relation. Note that a word/pointer set appears within
a synset, therefore the square brackets used to enclose it are treated
differently from those used to define an adjective cluster. Only one word
can be specified in each word/pointer set, and any number of pointers
may be included. A synset can have any number of word/pointer sets. Each
is treated by <B><A HREF="grind.1WN.html">grind</B>(1WN)<B></B></A>
essentially as a <I>word </I>, so they all must appear
before any synset <I>pointers </I> representing semantic relations. <P>
For verbs,
the word/pointer syntax is extended in the following manner to allow the
user to specify generic sentence frames that, like pointers, correspond
only to a specific word, rather than all the words in the synset. In this
case, <I>pointers </I> are optional. <P>
<blockquote><B>[&nbsp;&nbsp; </B> <I>word </I> <B>, </B> &nbsp;&nbsp;<I>[pointers]&nbsp;&nbsp;frames&nbsp;&nbsp; </I> <B>]
</B> <BR>
</blockquote>
<H3><A NAME="sect7" HREF="#toc7">Pointer Syntax </A></H3>
Pointers are optional in synsets. If a pointer is specified
outside of a word/pointer set, the relation is applied to all of the words
in the synset, including any words specified using the word/pointer syntax.
This indicates a semantic relation between the meanings of the words
in the synsets. If specified within a word/pointer set, the relation corresponds
only to the word in the set and represents a lexical relation. <P>
A pointer
is of the form: <P>
<blockquote><I>[lex_filename </I><B>: </B> <I>]word[lex_id] </I><B>, </B><I>pointer_symbol </I> <BR>
</blockquote>
<P>
or: <P>
<blockquote><I>[lex_filename </I><B>: </B> <I>]word[lex_id] </I><B>^ </B><I>word[lex_id] </I><B>, </B><I>pointer_symbol </I> <BR>
</blockquote>
<P>
For pointers, <I>word </I> indicates a word in another synset. When the second
form of a pointer is used, the first <I>word </I> indicates a word in a head
synset, and the second is a word in a satellite of that cluster. <I>word
</I> may be followed by a <I>lex_id </I> that is used to match the pointer to the
correct target synset. The synset containing <I>word </I> may reside in another
lexicographer file. In this case, <I>word </I> is preceded by <I>lex_filename </I> as
shown. <P>
See <FONT SIZE=-1><B>Pointers </B></FONT>
for a list of <I>pointer_symbol </I>s and their meanings.
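<P>
Both pointer forms can be recognized with one regular expression, sketched below. The pattern and the name <B>parse_pointer</B> are assumptions for illustration: trailing digits are always read as a <I>lex_id</I>, and numbers entered with a double quote are not handled.

```python
import re

POINTER_RE = re.compile(
    r"^(?:(?P<lex_filename>[a-z]+\.[A-Za-z_]+):)?"      # optional file name
    r"(?P<word>[^\s,^:]+?)(?P<lex_id>\d{1,2})?"          # word and lex_id
    r"(?:\^(?P<sat_word>[^\s,^:]+?)(?P<sat_lex_id>\d{1,2})?)?"  # satellite
    r",(?P<symbol>\S+)$"                                 # pointer_symbol
)


def parse_pointer(text):
    """Parse one pointer token into its parts; lex_id defaults to 0."""
    match = POINTER_RE.match(text)
    if match is None:
        raise ValueError("not a pointer: %r" % text)
    return {
        "lex_filename": match.group("lex_filename"),
        "word": match.group("word"),
        "lex_id": int(match.group("lex_id") or 0),
        "satellite_word": match.group("sat_word"),
        "satellite_lex_id": int(match.group("sat_lex_id") or 0),
        "symbol": match.group("symbol"),
    }
```

In the second form the word before the caret is the head synset word and the word after it is the satellite word, matching the description above.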
<H3><A NAME="sect8" HREF="#toc8">Verb Frame List Syntax </A></H3>
Frame numbers corresponding to generic sentence
frames must be entered in each verb synset. If a frame list is specified
outside of a word/pointer set, the verb frames in the list apply to all
of the words in the synset, including any words specified using the word/pointer
syntax. If specified within a word/pointer set, the verb frames in the
list correspond only to the word in the set. <P>
A frame number list is entered
as follows: <P>
<blockquote><B>frames: </B>&nbsp;&nbsp;<I>f_num </I>[<B>, </B><I>f_num...] </I> </blockquote>
<P>
Where <I>f_num </I> specifies a generic
frame number. See <FONT SIZE=-1><B>Verb Frames </B></FONT>
for a list of generic sentences and their
corresponding frame numbers.
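<P>
A frame list in this form can be read with a few lines of code. The sketch below is illustrative only: the name <B>parse_frame_list</B> is hypothetical, and the upper bound of 35 simply reflects the frames listed in this manual page.

```python
def parse_frame_list(text):
    """Parse a frame list such as 'frames: 8, 10' into a list of ints."""
    label, colon, rest = text.partition(":")
    if label.strip() != "frames" or not colon:
        raise ValueError("frame list must start with 'frames:'")
    numbers = [int(part) for part in rest.split(",")]
    for number in numbers:
        if not 1 <= number <= 35:
            raise ValueError("unknown frame number: %d" % number)
    return numbers
```

The same function applies whether the list appears at synset level or inside a word/pointer set; only the scope of the result differs.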
<H3><A NAME="sect9" HREF="#toc9">Gloss Syntax </A></H3>
A gloss is included in all synsets.
The lexicographer may enter a text string of any length desired. A gloss
is simply a string enclosed in parentheses with no embedded carriage returns.
It provides a definition of what the synset represents and/or example
sentences.
<H3><A NAME="sect10" HREF="#toc10">Special Adjective Syntax </A></H3>
The syntax for representing antonymous
adjective synsets requires several additional conditions. <P>
The first word
of a head synset <B>must </B> be entered in upper case, and can be thought of
as the head word of the head synset. The <I>word </I> part of a pointer from
one head synset to another head synset within the same cluster (usually
an antonym) must also be entered in upper case. Usually antonymous adjectives
are entered using the word/pointer syntax described in <FONT SIZE=-1><B>Word Syntax </B></FONT>
to
indicate a lexical relation. There is no restriction on the number of
parts that a cluster may have, and some clusters have three parts, representing
antonymous triplets, such as <B>solid </B>, <B>liquid </B>, and <B>gas </B>. <P>
A cross-cluster
pointer may be specified, allowing a head or satellite synset to point
to a head synset in a different cluster. A cross-cluster pointer is indicated
by entering the <I>word </I> part of the pointer in upper case. <P>
An adjective
may be annotated with a syntactic marker indicating a limitation on the
syntactic position the adjective may have in relation to the noun that it
modifies. If so marked, the marker appears between the word and its following
comma. If a <I>lex_id </I> is specified, the marker immediately follows it. The
syntactic markers are: <blockquote><B>(p) </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;predicate position <BR>
<B>(a) </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;prenominal (attributive)
position <BR>
<B>(ip) </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;immediately postnominal position<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp;<tt> </tt>&nbsp; <BR>
</blockquote>
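<P>
The markers can be separated from a word token as sketched below. The name <B>split_marker</B> is hypothetical, and the sketch assumes the marker is the last element before the comma; a <I>lex_id</I>, if present, is left attached to the word.

```python
import re

MARKERS = {
    "p": "predicate position",
    "a": "prenominal (attributive) position",
    "ip": "immediately postnominal position",
}

_MARKED = re.compile(r"^(?P<word>[^()\s,]+)\((?P<marker>ip|p|a)\),?$")


def split_marker(token):
    """Return (word, marker description) for a token such as 'lukewarm(a),'."""
    match = _MARKED.match(token)
    if match is None:
        return token.rstrip(","), None  # no syntactic marker present
    return match.group("word"), MARKERS[match.group("marker")]
```

A token without a marker comes back unchanged apart from its trailing comma, with None in place of the description.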
<H2><A NAME="sect11" HREF="#toc11">EXAMPLES </A></H2>
<I>(Note that
these are hypothetical examples not found in the WordNet lexicographer
files.) </I> <P>
Sample noun synsets: <blockquote>{ canine, [ dog1, cat,! ] pooch, canid,@
} <BR>
{ collie, dog1,@ (large multi-colored dog with pointy nose) } <BR>
{ hound,
hunting_dog, pack,#m dog1,@ } <BR>
{ dog, } <BR>
</blockquote>
<P>
Sample verb synsets: <blockquote>{ [ confuse,
clarify,! frames: 1 ] blur, obscure, frames: 8, 10 } <BR>
{ [ clarify, confuse,!
] make_clear, interpret,@ frames: 8 } <BR>
{ interpret, construe, understand,@
frames: 8 } <BR>
</blockquote>
<P>
Sample adjective clusters: <blockquote>[ <BR>
{ [ HOT, COLD,! ] lukewarm(a),
TEPID,^ (hot to the touch) } <BR>
{ warm, } <BR>
- <BR>
{ [ COLD, HOT,! ] frigid, (cold
to the touch) } <BR>
{ freezing, } <BR>
] <BR>
</blockquote>
<P>
Sample adverb synsets: <blockquote>{ [ basically,
adj.all:essential^basic,\ ] [ essentially, adj.all:basic^fundamental,\ ] ( by
one's very nature )} <BR>
{ pointedly, adj.all:pungent^pointed,\ } <BR>
{ [ badly,
adj.all:bad,\ well,! ] ill, ("He was badly prepared") } <BR>
</blockquote>
<H2><A NAME="sect12" HREF="#toc12">SEE ALSO </A></H2>
<B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
,
<B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
, <B><A HREF="lexnames.5WN.html">lexnames</B>(5WN)</A>
, <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
, <B><A HREF="uniqbeg.7WN.html">uniqbeg</B>(7WN)</A>
, <B><A HREF="wngloss.7WN.html">wngloss</B>(7WN)</A>
. <P>
Fellbaum,
C. (1998), ed. <I>"WordNet: An Electronic Lexical Database" </I>. MIT Press, Cambridge,
MA. <P>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">Lexicographer File Names</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Pointers</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Verb Frames</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Lexicographer File Format</A></LI>
<LI><A NAME="toc6" HREF="#sect6">Word Syntax</A></LI>
<LI><A NAME="toc7" HREF="#sect7">Pointer Syntax</A></LI>
<LI><A NAME="toc8" HREF="#sect8">Verb Frame List Syntax</A></LI>
<LI><A NAME="toc9" HREF="#sect9">Gloss Syntax</A></LI>
<LI><A NAME="toc10" HREF="#sect10">Special Adjective Syntax</A></LI>
</UL>
<LI><A NAME="toc11" HREF="#sect11">EXAMPLES</A></LI>
<LI><A NAME="toc12" HREF="#sect12">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,81 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNINTRO(1WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnintro - WordNet user commands
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>wn </B> - command line interface
to WordNet database <P>
<B>wnb </B> - window based WordNet browser
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
This
section of the <I>WordNet Reference Manual </I> contains manual pages that describe
commands available with the various WordNet system packages. <P>
The WordNet
interfaces <B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
and <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
allow the user to search the WordNet
database and display the information textually.
<H2><A NAME="sect3" HREF="#toc3">ENVIRONMENT VARIABLES
(UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet. Default is <B>/usr/local/WordNet-3.0
</B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the WordNet database has been installed.
Default is <B>WNHOME/dict </B>. </DD>
</DL>
<H2><A NAME="sect4" HREF="#toc4">REGISTRY (WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B>
</DT>
<DD>Base directory for WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect5" HREF="#toc5">SEE
ALSO </A></H2>
<B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
, <B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
, <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
, <B><A HREF="wnintro.7WN.html">wnintro</B>(7WN)</A>
.
<P>
Fellbaum, C. (1998), ed. <I>"WordNet: An Electronic Lexical Database" </I>. MIT
Press, Cambridge, MA.
<H2><A NAME="sect6" HREF="#toc6">AVAILABILITY </A></H2>
WordNet has a World Wide Web site at
<B><A HREF="http://wordnet.princeton.edu">http://wordnet.princeton.edu</A>
</B>. From this web site users can learn about
the WordNet project, run several different interfaces to the WordNet database,
and download various WordNet system packages and <I>"Five Papers on WordNet"
</I>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc4" HREF="#sect4">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc5" HREF="#sect5">SEE ALSO</A></LI>
<LI><A NAME="toc6" HREF="#sect6">AVAILABILITY</A></LI>
</UL>
</BODY></HTML>

View File

@ -1,365 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNINTRO(3WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnintro - introduction to WordNet library functions
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION
</A></H2>
This section of the <I>WordNet Reference Manual </I> contains manual pages that
describe the WordNet library functions and API. <P>
Functions are organized
into the following categories: <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=LEFT><B>Category </B> </TD> <TD ALIGN=LEFT><B>Manual Page </B> </TD> <TD ALIGN=LEFT><B>Object File
</B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Database Search </TD> <TD ALIGN=LEFT>wnsearch (3WN) </TD> <TD ALIGN=LEFT>search.o </TD> </TR>
<TR> <TD ALIGN=LEFT>Morphology </TD> <TD ALIGN=LEFT>morph (3WN)
</TD> <TD ALIGN=LEFT>morph.o </TD> </TR>
<TR> <TD ALIGN=LEFT>Misc. Utility </TD> <TD ALIGN=LEFT>wnutil (3WN) </TD> <TD ALIGN=LEFT>wnutil.o </TD> </TR>
<TR> <TD ALIGN=LEFT>Binary Search </TD> <TD ALIGN=LEFT>binsrch
(3WN) </TD> <TD ALIGN=LEFT>binsrch.o </TD> </TR>
</TABLE>
<P>
The WordNet library is used by all of the searching
interfaces provided with the various WordNet packages. Additional programs
in the system, such as <B><A HREF="grind.1WN.html">grind</B>(1WN)</A>
, also use functions in this library.
<P>
The WordNet library is provided in both source and binary forms (on some
platforms) to allow users to build applications and tools to their own
specifications that utilize the WordNet database. We do not provide programming
support or assistance. <P>
The code conforms to ANSI C standards. Functions
are defined with function prototypes. If you do not have a compiler that
accepts prototypes, you must edit the source code and remove the prototypes
before compiling.
<H2><A NAME="sect2" HREF="#toc2">LIST OF WORDNET LIBRARY FUNCTIONS </A></H2>
Not all library functions
are listed below. Missing are mainly functions that are called by documented
ones, or ones that were written for specific applications or tools used
during WordNet development. Data structures are defined in <B>wn.h </B>. <P>
<H3><A NAME="sect3" HREF="#toc3">Database
Searching Functions (search.o) </A></H3>
<P>
<DL>
<DT><B>findtheinfo </B> </DT>
<DD>Primary search function for
WordNet database. Returns formatted search results in text buffer. Used
by WordNet interfaces to perform requested search. </DD>
<DT><B>findtheinfo_ds</B> </DT>
<DD>Primary
search function for WordNet database. Returns search results in linked
list data structure. </DD>
<DT><B>is_defined</B> </DT>
<DD>Set bit for each search type that is valid
for the search word passed and return bit mask. </DD>
<DT><B>in_wn</B> </DT>
<DD>Set bit for each
syntactic category that search word is in. </DD>
<DT><B>index_lookup</B> </DT>
<DD>Find word in index
file and return parsed entry in data structure. Input word must be exact
match of string in database. Called by <B>getindex() </B>. </DD>
<DT><B>getindex</B> </DT>
<DD>Find word
in index file, trying different techniques - replace hyphens with underscores,
replace underscores with hyphens, strip hyphens and underscores, strip
periods. </DD>
<DT><B>read_synset</B> </DT>
<DD>Read synset from data file at byte offset passed
and return parsed entry in data structure. Calls <B>parse_synset() </B>. </DD>
<DT><B>parse_synset</B>
</DT>
<DD>Read synset at current byte offset in file and return parsed entry in
data structure. </DD>
<DT><B>free_syns</B> </DT>
<DD>Free a synset linked list allocated by <B>findtheinfo_ds()
</B>. </DD>
<DT><B>free_synset</B> </DT>
<DD>Free a synset structure. </DD>
<DT><B>free_index</B> </DT>
<DD>Free an index structure.
</DD>
<DT><B>traceptrs_ds</B> </DT>
<DD>Recursive search algorithm to trace a pointer tree and return
results in linked list. </DD>
<DT><B>do_trace</B> </DT>
<DD>Do requested search on synset passed
returning formatted output in buffer. </DD>
</DL>
<P>
<H3><A NAME="sect4" HREF="#toc4">Morphology Functions (morph.o) </A></H3>
<P>
<DL>
<DT><B>morphinit</B> </DT>
<DD>Open exception list files. </DD>
<DT><B>re_morphinit</B> </DT>
<DD>Close exception list
files and reopen. </DD>
<DT><B>morphstr</B> </DT>
<DD>Try to find base form (lemma) of word or collocation
in syntactic category passed. Calls <B>morphword() </B> for each word in string
passed. </DD>
<DT><B>morphword</B> </DT>
<DD>Try to find base form (lemma) of individual word in
syntactic category passed. </DD>
</DL>
<P>
<H3><A NAME="sect5" HREF="#toc5">Utility Functions (wnutil.o) </A></H3>
<P>
<DL>
<DT><B>wninit</B> </DT>
<DD>Top level
function to open database files and morphology exception lists. </DD>
<DT><B>re_wninit</B>
</DT>
<DD>Top level function to close and reopen database files and morphology
exception lists. </DD>
<DT><B>cntwords</B> </DT>
<DD>Count the number of underscore or space separated
words in a string. </DD>
<DT><B>strtolower</B> </DT>
<DD>Convert string to lower case and remove
trailing adjective marker if found. </DD>
<DT><B>ToLowerCase</B> </DT>
<DD>Convert string passed
to lower case. </DD>
<DT><B>strsubst</B> </DT>
<DD>Replace all occurrences of <I>from </I> with <I>to </I> in <I>str
</I>. </DD>
<DT><B>getptrtype</B> </DT>
<DD>Return code for pointer type character passed. </DD>
<DT><B>getpos</B> </DT>
<DD>Return
syntactic category code for string passed. </DD>
<DT><B>getsstype</B> </DT>
<DD>Return synset type
code for string passed. </DD>
<DT><B>FmtSynset</B> </DT>
<DD>Reconstruct synset string from synset
pointer. </DD>
<DT><B>StrToPos</B> </DT>
<DD>Passed string for syntactic category, returns corresponding
integer value. </DD>
<DT><B>GetSynsetForSense</B> </DT>
<DD>Return synset for sense key passed. </DD>
<DT><B>GetDataOffset</B>
</DT>
<DD>Find synset offset for sense. </DD>
<DT><B>GetPolyCount</B> </DT>
<DD>Find polysemy count for sense
passed. </DD>
<DT><B>GetWORD</B> </DT>
<DD>Return word part of sense key. </DD>
<DT><B>GetPOS</B> </DT>
<DD>Return syntactic
category code for sense key passed. </DD>
<DT><B>WNSnsToStr</B> </DT>
<DD>Generate sense key for
index entry passed. </DD>
<DT><B>GetValidIndexPointer</B> </DT>
<DD>Search for string and/or base
form of word in database and return index structure for word if found.
</DD>
<DT><B>GetWNSense</B> </DT>
<DD>Return sense number in database for sense key. </DD>
<DT><B>GetSenseIndex</B>
</DT>
<DD>Return parsed sense index entry for sense key passed. </DD>
<DT><B>default_display_message</B>
</DT>
<DD>Default function to use as value of <B>display_message </B>. Simply returns
<B>-1 </B>. </DD>
</DL>
<P>
<H3><A NAME="sect6" HREF="#toc6">Binary Search Functions (binsrch.o) </A></H3>
<P>
<DL>
<DT><B>bin_search</B> </DT>
<DD>General purpose binary
search function to search for key as first item on line in sorted file.
</DD>
<DT><B>copyfile</B> </DT>
<DD>Copy contents from one file to another. </DD>
<DT><B>replace_line</B> </DT>
<DD>Replace
a line in a sorted file. </DD>
<DT><B>insert_line</B> </DT>
<DD>Insert a line into a sorted file.
</DD>
</DL>
<H2><A NAME="sect7" HREF="#toc7">HEADER FILE </A></H2>
<DL>
<DT><B>wn.h</B> </DT>
<DD>WordNet include file of constants, data structures,
external declarations for global variables initialized in <B>wnglobal.c </B>. Also
lists function prototypes for library API. It must be included to use any
WordNet library functions. </DD>
</DL>
<H2><A NAME="sect8" HREF="#toc8">NOTES </A></H2>
All library functions that access the
database files expect the files to be open. The function <A HREF="wninit.3WN.html"><B>wninit</B>(3WN)</A>
must
be called before other database access functions such as <A HREF="findtheinfo.3WN.html"><B>findtheinfo</B>(3WN)</A>
or <A HREF="read_synset.3WN.html"><B>read_synset</B>(3WN)</A>.
<P>
Inclusion of the header file <B>wn.h </B> is necessary. <P>
The
command line interface is a good example of a simple application that
uses several WordNet library functions. <P>
Many of the library functions
are passed or return syntactic category or synset type information. The
following table lists the possible categories as integer codes, synset
type constant names, syntactic category constant names, single characters
and character strings. <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>Integer </B> </TD> <TD ALIGN=CENTER><B>Synset Type </B> </TD> <TD ALIGN=CENTER><B>Syntactic Category </B>
</TD> <TD ALIGN=CENTER><B>Char </B> </TD> <TD ALIGN=CENTER><B>String </B> </TD> </TR>
<TR> <TD ALIGN=CENTER>1 </TD> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=LEFT>NOUN </TD> <TD ALIGN=CENTER>n </TD> <TD ALIGN=LEFT>noun </TD> </TR>
<TR> <TD ALIGN=CENTER>2 </TD> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=LEFT>VERB </TD> <TD ALIGN=CENTER>v </TD> <TD ALIGN=LEFT>verb
</TD> </TR>
<TR> <TD ALIGN=CENTER>3 </TD> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=CENTER>a </TD> <TD ALIGN=LEFT>adj </TD> </TR>
<TR> <TD ALIGN=CENTER>4 </TD> <TD ALIGN=LEFT>ADV </TD> <TD ALIGN=LEFT>ADV </TD> <TD ALIGN=CENTER>r </TD> <TD ALIGN=LEFT>adv </TD> </TR>
<TR> <TD ALIGN=CENTER>5 </TD> <TD ALIGN=LEFT>SATELLITE </TD> <TD ALIGN=LEFT>ADJ </TD> <TD ALIGN=CENTER>s
</TD> <TD ALIGN=LEFT><I>n/a </I> </TD> </TR>
</TABLE>
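<P>
The table above can be read as a small lookup. The following is a minimal sketch, not code from the WordNet library: the helper names <B>pos_code_from_char </B> and <B>syntactic_category </B> are hypothetical and exist only for illustration. It maps the single-character code to the integer code, and collapses the SATELLITE synset type to its ADJ syntactic category.

```c
#include <assert.h>

/* Hypothetical helper (not declared in wn.h): map the single-character
 * code from the table to its integer code. Returns 0 for a character
 * that does not appear in the table. */
int pos_code_from_char(char c)
{
    switch (c) {
    case 'n': return 1;  /* NOUN */
    case 'v': return 2;  /* VERB */
    case 'a': return 3;  /* ADJ */
    case 'r': return 4;  /* ADV */
    case 's': return 5;  /* SATELLITE synset type */
    default:  return 0;  /* not a known code */
    }
}

/* Hypothetical helper: convert a synset type code (1-5) to the
 * syntactic category code. Only SATELLITE (5) differs: per the table,
 * its syntactic category is ADJ (3). */
int syntactic_category(int synset_type)
{
    assert(synset_type >= 1 && synset_type <= 5);
    return synset_type == 5 ? 3 : synset_type;
}
```

The library's own <B>getpos </B> and <B>getsstype </B> functions, listed above, are the supported way to obtain these codes; the sketch only makes the table's mapping explicit.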
<H2><A NAME="sect9" HREF="#toc9">ENVIRONMENT VARIABLES (UNIX) </A></H2>
<DL>
<DT><B>WNHOME</B> </DT>
<DD>Base directory for WordNet.
Default is <B>/usr/local/WordNet-3.0 </B>. </DD>
<DT><B>WNSEARCHDIR</B> </DT>
<DD>Directory in which the
WordNet database has been installed. Default is <B>WNHOME/dict </B>. </DD>
</DL>
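<P>
The defaults described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function name <B>resolve_search_dir </B> is hypothetical, and treating an empty variable the same as an unset one is an assumption made here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Default base directory as documented for WNHOME. */
#define DEFAULT_WNHOME "/usr/local/WordNet-3.0"

/* Hypothetical helper: write the database directory into out.
 * wnsearchdir and wnhome are the values of the two environment
 * variables, or NULL when unset. Assumption: an empty string is
 * treated like an unset variable.
 * Returns 0 on success, -1 if the result did not fit in out. */
int resolve_search_dir(const char *wnsearchdir, const char *wnhome,
                       char *out, size_t outsize)
{
    int n;

    assert(out != NULL && outsize > 0);
    if (wnsearchdir != NULL && strlen(wnsearchdir) > 0) {
        /* WNSEARCHDIR takes precedence when it is set. */
        n = snprintf(out, outsize, "%s", wnsearchdir);
    } else {
        /* Otherwise fall back to WNHOME/dict, using the default WNHOME. */
        if (wnhome == NULL || strlen(wnhome) == 0)
            wnhome = DEFAULT_WNHOME;
        n = snprintf(out, outsize, "%s/dict", wnhome);
    }
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

A caller would typically pass the results of the standard getenv() for WNSEARCHDIR and WNHOME as the first two arguments.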
<H2><A NAME="sect10" HREF="#toc10">REGISTRY
(WINDOWS) </A></H2>
<DL>
<DT><B>HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome</B> </DT>
<DD>Base directory for
WordNet. Default is <B>C:\Program&nbsp;Files\WordNet\3.0 </B>. </DD>
</DL>
<H2><A NAME="sect11" HREF="#toc11">FILES </A></H2>
<DL>
<DT><B>lib/libwn.a</B> </DT>
<DD>WordNet
library (Unix) </DD>
<DT><B>lib\wn.lib</B> </DT>
<DD>WordNet library (Windows) </DD>
<DT><B>include</B> </DT>
<DD>header files
for use with WordNet library </DD>
</DL>
<H2><A NAME="sect12" HREF="#toc12">SEE ALSO </A></H2>
<A HREF="wnintro.1WN.html"><B>wnintro</B>(1WN)</A>,
<A HREF="binsrch.3WN.html"><B>binsrch</B>(3WN)</A>,
<A HREF="morph.3WN.html"><B>morph</B>(3WN)</A>,
<A HREF="wnsearch.3WN.html"><B>wnsearch</B>(3WN)</A>,
<A HREF="wnutil.3WN.html"><B>wnutil</B>(3WN)</A>,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>,
<A HREF="wnintro.7WN.html"><B>wnintro</B>(7WN)</A>. <P>
Fellbaum, C. (1998),
ed. <I>"WordNet: An Electronic Lexical Database" </I>. MIT Press, Cambridge, MA.
<H2><A NAME="sect13" HREF="#toc13">BUGS </A></H2>
Please report bugs to <B>wordnet@princeton.edu </B>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<LI><A NAME="toc2" HREF="#sect2">LIST OF WORDNET LIBRARY FUNCTIONS</A></LI>
<UL>
<LI><A NAME="toc3" HREF="#sect3">Database Searching Functions (search.o)</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Morphology Functions (morph.o)</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Utility Functions (wnutil.o)</A></LI>
<LI><A NAME="toc6" HREF="#sect6">Binary Search Functions (binsrch.o)</A></LI>
</UL>
<LI><A NAME="toc7" HREF="#sect7">HEADER FILE</A></LI>
<LI><A NAME="toc8" HREF="#sect8">NOTES</A></LI>
<LI><A NAME="toc9" HREF="#sect9">ENVIRONMENT VARIABLES (UNIX)</A></LI>
<LI><A NAME="toc10" HREF="#sect10">REGISTRY (WINDOWS)</A></LI>
<LI><A NAME="toc11" HREF="#sect11">FILES</A></LI>
<LI><A NAME="toc12" HREF="#sect12">SEE ALSO</A></LI>
<LI><A NAME="toc13" HREF="#sect13">BUGS</A></LI>
</UL>
</BODY></HTML>

@ -1,71 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNINTRO(5WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnintro - introduction to descriptions of WordNet file formats
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>cntlist </B> - format of <B>cntlist </B> and <B>cntlist.rev </B> files <P>
<B>lexnames </B>
- list of lexicographer file names and numbers <P>
<B>prologdb </B> - description of
Prolog database files <P>
<B>senseidx </B> - format of sense index file <P>
<B>sensemap </B>
- mapping from senses in WordNet 2.1 to corresponding 3.0 senses <P>
<B>wndb </B> - format
of WordNet database files <P>
<B>wninput </B> - format of WordNet lexicographer files
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
This section of the <I>WordNet Reference Manual </I> contains manual
pages that describe the formats of the various files included in different
WordNet 3.0 packages.
<H2><A NAME="sect3" HREF="#toc3">NOMENCLATURE </A></H2>
All files are in ASCII. Fields are generally
separated by one space, unless otherwise noted, and each line is terminated
with a newline character. In the file format descriptions, terms in <I>italics
</I> refer to field names. Characters or strings in <B>boldface </B> represent an
actual character or string as it appears in the file. Items enclosed in
italicized square brackets (<I>[&nbsp;&nbsp;] </I>) may not be present. Since several files
contain fields that have the identical meaning, field names are consistently
defined. For example, several WordNet files contain one or more <I>synset_offset
</I> fields. In each case, the definition of <I>synset_offset </I> is identical.
<H2><A NAME="sect4" HREF="#toc4">SEE ALSO </A></H2>
<A HREF="wnintro.1WN.html"><B>wnintro</B>(1WN)</A>,
<A HREF="wnintro.3WN.html"><B>wnintro</B>(3WN)</A>,
<A HREF="cntlist.5WN.html"><B>cntlist</B>(5WN)</A>,
<A HREF="lexnames.5WN.html"><B>lexnames</B>(5WN)</A>,
<A HREF="prologdb.5WN.html"><B>prologdb</B>(5WN)</A>,
<A HREF="senseidx.5WN.html"><B>senseidx</B>(5WN)</A>,
<A HREF="sensemap.5WN.html"><B>sensemap</B>(5WN)</A>,
<A HREF="wndb.5WN.html"><B>wndb</B>(5WN)</A>,
<A HREF="wninput.5WN.html"><B>wninput</B>(5WN)</A>,
<A HREF="wnintro.7WN.html"><B>wnintro</B>(7WN)</A>,
<A HREF="wngloss.7WN.html"><B>wngloss</B>(7WN)</A>.
<P>
Fellbaum, C. (1998), ed. <I>"WordNet: An Electronic Lexical Database" </I>. MIT
Press, Cambridge, MA. <P>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">NOMENCLATURE</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

@ -1,57 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNINTRO(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnintro - introduction to miscellaneous WordNet information
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS
</A></H2>
<P>
<B>morphy </B> - discussion of WordNet's morphological processing <P>
<B>uniqbeg </B> - unique
beginners for noun hierarchies <P>
<B>wngloss </B> - glossary of terms used in WordNet
<P>
<B>wngroups </B> - discussion of WordNet search code to group similar senses <P>
<B>wnlicens
</B> - text of WordNet license agreement <P>
<B>wnpkgs </B> - information about WordNet
packages and distribution <P>
<B>wnstats </B> - database statistics
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
This
section of the <I>WordNet Reference Manual </I> contains manual pages that describe
various topics related to WordNet and the semantic concordances, and a
glossary of terms.
<H2><A NAME="sect3" HREF="#toc3">SEE ALSO </A></H2>
<A HREF="wnintro.1WN.html"><B>wnintro</B>(1WN)</A>,
<A HREF="wnintro.3WN.html"><B>wnintro</B>(3WN)</A>,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>,
<A HREF="morphy.7WN.html"><B>morphy</B>(7WN)</A>,
<A HREF="uniqbeg.7WN.html"><B>uniqbeg</B>(7WN)</A>,
<A HREF="wngroups.7WN.html"><B>wngroups</B>(7WN)</A>,
<A HREF="wnlicens.7WN.html"><B>wnlicens</B>(7WN)</A>,
<A HREF="wnpkgs.7WN.html"><B>wnpkgs</B>(7WN)</A>,
<A HREF="wnstats.7WN.html"><B>wnstats</B>(7WN)</A>,
<A HREF="wngloss.7WN.html"><B>wngloss</B>(7WN)</A>. <P>
Fellbaum, C. (1998), ed. <I>"WordNet: An Electronic
Lexical Database" </I>. MIT Press, Cambridge, MA. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

@ -1,45 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNLICENS(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnlicens - text of WordNet license
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
WordNet Release 3.0
<P>
This software and database is being provided to you, the LICENSEE, by
Princeton University under the following license. By obtaining, using
and/or copying this software and database, you agree that you have
read, understood, and will comply with these terms and conditions.:
Permission to use, copy, modify and distribute this software and
database and its documentation for any purpose and without fee or royalty
is hereby granted, provided that you agree to comply with the following
copyright notice and statements, including the disclaimer, and that
the same appear on ALL copies of the software, database and documentation,
including modifications that you make for internal use or for distribution.
WordNet 3.0 Copyright 2006 by Princeton University. All rights reserved.
THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON UNIVERSITY
MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF
EXAMPLE, BUT NOT LIMITATION, PRINCETON UNIVERSITY MAKES NO REPRESENTATIONS
OR WARRANTIES OF MERCHANT- ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE
OR THAT THE USE OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION
WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
OTHER RIGHTS. The name of Princeton University or Princeton may
not be used in advertising or publicity pertaining to distribution of
the software and/or database. Title to copyright in this software, database
and any associated documentation shall at all times remain with Princeton
University and LICENSEE agrees to preserve same. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
</UL>
</BODY></HTML>

@ -1,95 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNPKGS(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnpkgs - description of various WordNet system packages
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION
</A></H2>
WordNet 3.0 is distributed in several formats and in various packages. All
of the packages are available via anonymous FTP from <B>ftp.cogsci.princeton.edu
</B> and from the WordNet Web site at <B><A HREF="http://wordnet.princeton.edu">http://wordnet.princeton.edu</A>
</B>.
<H3><A NAME="sect2" HREF="#toc2">Packages
Available Via FTP and WWW </A></H3>
The following WordNet packages can be downloaded
using a web browser from <B>ftp://ftp.cogsci.princeton.edu/wordnet/3.0 </B>, or from
the Web site noted above. Users can also FTP directly from <B>ftp.cogsci.princeton.edu
</B>, directory <B>wordnet/3.0 </B>. <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>Package </B> </TD> <TD ALIGN=CENTER><B>Filename </B> </TD> <TD ALIGN=CENTER><B>Platform </B> </TD> <TD ALIGN=CENTER><B>Description
</B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Database </TD> <TD ALIGN=LEFT><B>WordNet-3.0.tar.gz </B> </TD> <TD ALIGN=LEFT>Unix/OS X </TD> <TD ALIGN=LEFT>WordNet 3.0 database, interfaces,
sense index, interface and library source code, documentation. </TD> </TR>
<TR> <TD ALIGN=LEFT>Database
</TD> <TD ALIGN=LEFT><B>WordNet-3.0.exe </B> </TD> <TD ALIGN=LEFT>Windows </TD> <TD ALIGN=LEFT>WordNet 3.0 database, interfaces, sense index,
interface and library source code, documentation. </TD> </TR>
<TR> <TD ALIGN=LEFT>Prolog Database </TD>
<TD ALIGN=LEFT><B>WNprolog-3.0.tar.gz </B> </TD> <TD ALIGN=LEFT>All </TD> <TD ALIGN=LEFT>WordNet 3.0 database files in Prolog-readable format,
documentation. </TD> </TR>
<TR> <TD ALIGN=LEFT>Sense Map </TD> <TD ALIGN=LEFT><B>WNsnsmap-3.0.tar.gz </B> </TD> <TD ALIGN=LEFT>All </TD> <TD ALIGN=LEFT>Mapping of 2.1 to 3.0
senses, documentation. </TD> </TR>
</TABLE>
<P>
<H3><A NAME="sect3" HREF="#toc3">Database Package </A></H3>
The database package is a
complete installation for WordNet 3.0 users. It includes the 3.0 database
files, source code for the WordNet browsers and library, and documentation.
The other packages are not included - they must be downloaded and installed
separately. <P>
Note that with this version of WordNet for Unix platforms,
only source code is provided. Users should carefully read the README and
INSTALL files for detailed information on compiling WordNet and dependencies.
<P>
<H3><A NAME="sect4" HREF="#toc4">Prolog Database Package </A></H3>
The WordNet 3.0 database files are available
in this package in a Prolog-readable format. Documentation describing the
file format is included. This package is only downloadable in compressed
tar file format, although once unpackaged it can be used from Windows
systems since the files are in ASCII. Many Windows utilities, such as
WinZip, can deal with a compressed tar file.
<H3><A NAME="sect5" HREF="#toc5">Sense Map Package </A></H3>
To help
users automatically convert 2.1 noun and verb senses to their corresponding
3.0 senses, we provide sense mapping information in this package. This
package contains files to map polysemous and monosemous words, and documentation
that describes the format of these files. As with the Prolog database,
this package is only downloadable in compressed tar format, but the files
are also in ASCII.
<H2><A NAME="sect6" HREF="#toc6">NOTES </A></H2>
The lexicographer files and the <A HREF="grind.1WN.html"><B>grind</B>(1WN)</A>
program
are not generally distributed. <P>
Some of the packages described above may
not be available at the time of release of the 3.0 database package.
<H2><A NAME="sect7" HREF="#toc7">SEE
ALSO </A></H2>
<A HREF="wnintro.1WN.html"><B>wnintro</B>(1WN)</A>,
<A HREF="wnintro.3WN.html"><B>wnintro</B>(3WN)</A>,
<A HREF="wnintro.5WN.html"><B>wnintro</B>(5WN)</A>,
<A HREF="wnintro.7WN.html"><B>wnintro</B>(7WN)</A>. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">Packages Available Via FTP and WWW</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Database Package</A></LI>
<LI><A NAME="toc4" HREF="#sect4">Prolog Database Package</A></LI>
<LI><A NAME="toc5" HREF="#sect5">Sense Map Package</A></LI>
</UL>
<LI><A NAME="toc6" HREF="#sect6">NOTES</A></LI>
<LI><A NAME="toc7" HREF="#sect7">SEE ALSO</A></LI>
</UL>
</BODY></HTML>

@ -1,338 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNSEARCH(3WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
findtheinfo, findtheinfo_ds, is_defined, in_wn, index_lookup, parse_index,
getindex, read_synset, parse_synset, free_syns, free_synset, free_index,
traceptrs_ds, do_trace
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>#include "wn.h" </B> <P>
<B>char *findtheinfo(char
*searchstr, int pos, int ptr_type, int sense_num); </B> <P>
<B>SynsetPtr findtheinfo_ds(char
*searchstr, int pos, int ptr_type, int sense_num ); </B> <P>
<B>unsigned int is_defined(char
*searchstr, int pos); </B> <P>
<B>unsigned int in_wn(char *searchstr, int pos); </B>
<P>
<B>IndexPtr index_lookup(char *searchstr, int pos); </B> <P>
<B>IndexPtr parse_index(long
offset, int dbase, char *line); </B> <P>
<B>IndexPtr getindex(char *searchstr, int
pos); </B> <P>
<B>SynsetPtr read_synset(int pos, long synset_offset, char *searchstr);
</B> <P>
<B>SynsetPtr parse_synset(FILE *fp, int pos, char *searchstr); </B> <P>
<B>void free_syns(SynsetPtr
synptr); </B> <P>
<B>void free_synset(SynsetPtr synptr); </B> <P>
<B>void free_index(IndexPtr
idx); </B> <P>
<B>SynsetPtr traceptrs_ds(SynsetPtr synptr, int ptr_type, int pos,
int depth); </B> <P>
<B>char *do_trace(SynsetPtr synptr, int ptr_type, int pos, int
depth); </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION </A></H2>
<P>
These functions are used for searching the WordNet
database. They generally fall into several categories: functions for reading
and parsing index file entries; functions for reading and parsing synsets
in data files; functions for tracing pointers and hierarchies; functions
for freeing space occupied by data structures allocated with <A HREF="malloc.3.html"><B>malloc</B>(3)</A>.
<P>
In the following function descriptions, <I>pos </I> is one of the following:
<P>
<blockquote><B>1 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;NOUN <BR>
<B>2 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;VERB <BR>
<B>3 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADJECTIVE <BR>
<B>4 </B><tt> </tt>&nbsp;<tt> </tt>&nbsp;ADVERB <BR>
</blockquote>
<P>
<B>findtheinfo()</B> is the primary
search algorithm for use with database interface applications. Search
results are automatically formatted, and a pointer to the text buffer
is returned. All searches listed in <B>WNHOME/include/wn.h</B> can be done by
<B>findtheinfo()</B>. <B>findtheinfo_ds()</B> can be used to perform most of the searches,
with results returned in a linked list data structure. This is for use
with applications that need to analyze the search results rather than
just display them. <P>
Both functions are passed the same arguments: <I>searchstr
</I> is the word or collocation to search for; <I>pos </I> indicates the syntactic
category to search in; <I>ptr_type </I> is one of the valid search types for
<I>searchstr </I> in <I>pos </I>. (Available searches can be obtained by calling <B>is_defined()</B>
described below.) <I>sense_num </I> should be <FONT SIZE=-1><B>ALLSENSES </B></FONT>
if the search is to
be done on all senses of <I>searchstr </I> in <I>pos </I>, or a positive integer indicating
which sense to search. <P>
<B>findtheinfo_ds() </B> returns a linked list of data structures
representing synsets. Senses are linked through the <I>nextss </I> field of a
<B>Synset </B> data structure. For each sense, synsets that match the search
specified with <I>ptr_type </I> are linked through the <I>ptrlist </I> field. See <FONT SIZE=-1><B>Synset
Navigation </B></FONT>
below, for detailed information on the linked lists returned.
<P>
<B>is_defined() </B> sets a bit for each search type that is valid for <I>searchstr
</I> in <I>pos </I>, and returns the resulting unsigned integer. Each bit number
corresponds to a pointer type constant defined in <B>WNHOME/include/wn.h </B>.
For example, if bit 2 is set, the <FONT SIZE=-1><B>HYPERPTR </B></FONT>
search is valid for <I>searchstr
</I>. There are 29 possible searches. <P>
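The bit test described above can be sketched as follows. This is a minimal illustration, not part of the library: <B>search_is_valid </B> is a hypothetical helper, the two constants are copied from the <I>ptr_type </I> table in this page rather than taken from <B>wn.h </B>, and bits are assumed to be numbered from 0 at the least significant end.

```c
#include <assert.h>

/* Values copied from the ptr_type table in this manual page; a real
 * program would take them from wn.h instead. */
#define ANTPTR   1
#define HYPERPTR 2

/* Hypothetical helper: return 1 if the bit for ptr_type is set in a
 * mask such as the one returned by is_defined(), else 0.
 * Assumption: bit 0 is the least significant bit. */
int search_is_valid(unsigned int mask, int ptr_type)
{
    assert(ptr_type >= 0 && ptr_type < 32);
    return (int)((mask >> ptr_type) & 1u);
}
```

With a mask in which only bit 2 is set, <B>search_is_valid(mask, HYPERPTR) </B> is 1 and <B>search_is_valid(mask, ANTPTR) </B> is 0. <P>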
<B>in_wn() </B> is used to find the syntactic
categories in the WordNet database that contain one or more senses of
<I>searchstr </I>. If <I>pos </I> is <FONT SIZE=-1><B>ALL_POS, </B></FONT>
all syntactic categories are checked.
Otherwise, only the part of speech passed is checked. An unsigned integer
is returned with a bit set corresponding to each syntactic category containing
<I>searchstr </I>. The bit number matches the number for the part of speech.
<B>0 </B> is returned if <I>searchstr </I> is not present in <I>pos </I>. <P>
<B>index_lookup() </B> finds
<I>searchstr </I> in the index file for <I>pos </I> and returns a pointer to the parsed
entry in an <B>Index </B> data structure. <I>searchstr </I> must exactly match the form
of the word (lower case only, hyphens and underscores in the same places)
in the index file. <FONT SIZE=-1><B>NULL </B></FONT>
is returned if a match is not found. <P>
<B>parse_index()
</B> parses an entry from an index file and returns a pointer to the parsed
entry in an <B>Index </B> data structure. Passed the byte <I>offset </I> and syntactic
category, it reads the index entry at the desired location in the corresponding
file. If <I>line </I> is passed, it must contain an index file entry, and the database
index file is not consulted. However, <I>offset </I> and <I>dbase </I> should still
be passed so the information can be stored in the <B>Index </B> structure. <P>
<B>getindex()
</B> is a "smart" search for <I>searchstr </I> in the index file corresponding to
<I>pos </I>. It applies to <I>searchstr </I> an algorithm that replaces underscores
with hyphens, hyphens with underscores, removes hyphens and underscores,
and removes periods in an attempt to find a form of the string that is
an exact match for an entry in the index file corresponding to <I>pos </I>. <B>index_lookup()
</B> is called on each transformed string until a match is found or all the
different strings have been tried. It returns a pointer to the parsed
<B>Index </B> data structure for <I>searchstr </I>, or <FONT SIZE=-1><B>NULL </B></FONT>
if a match is not found.
<P>
<B>read_synset() </B> is used to read a synset from a byte offset in a data
file. It performs an <A HREF="fseek.3.html"><B>fseek</B>(3)</A>
to <I>synset_offset </I> in the data file corresponding
to <I>pos </I>, and calls <B>parse_synset() </B> to read and parse the synset. A pointer
to the <B>Synset </B> data structure containing the parsed synset is returned.
<P>
<B>parse_synset() </B> reads the synset at the current offset in the file indicated
by <I>fp </I>. <I>pos </I> is the syntactic category, and <I>searchstr </I>, if not <FONT SIZE=-1><B>NULL, </B></FONT>
indicates the word in the synset that the caller is interested in. An
attempt is made to match <I>searchstr </I> to one of the words in the synset.
If an exact match is found, the <I>whichword </I> field in the <B>Synset </B> structure
is set to that word's number in the synset (beginning to count from <B>1 </B>).
<P>
<B>free_syns() </B> is used to free a linked list of <B>Synset </B> structures allocated
by <B>findtheinfo_ds() </B>. <I>synptr </I> is a pointer to the list to free. <P>
<B>free_synset()
</B> frees the <B>Synset </B> structure pointed to by <I>synptr </I>. <P>
<B>free_index() </B> frees
the <B>Index </B> structure pointed to by <I>idx </I>. <P>
<B>traceptrs_ds() </B> is a recursive
search algorithm that traces pointers matching <I>ptr_type </I> starting with
the synset pointed to by <I>synptr </I>. Setting <I>depth </I> to <B>1 </B> when <B>traceptrs_ds()
</B> is called indicates a recursive search; <B>0 </B> indicates a non-recursive call.
<I>synptr </I> points to the data structure representing the synset to search
for a pointer of type <I>ptr_type </I>. When a pointer type match is found, the
synset pointed to is read and linked onto the <I>nextss </I> chain. Levels of
the tree generated by a recursive search are linked via the <I>ptrlist </I> field
until <FONT SIZE=-1><B>NULL </B></FONT>
is found, indicating the top (or bottom) of the
tree. This function is usually called from <B>findtheinfo_ds() </B> for each
sense of the word. See <FONT SIZE=-1><B>Synset Navigation </B></FONT>
below, for detailed information
on the linked lists returned. <P>
<B>do_trace() </B> performs the search indicated
by <I>ptr_type </I> on synset <I>synptr </I> in syntactic category <I>pos </I>. <I>depth </I> is
defined as above. <B>do_trace() </B> returns the search results formatted in
a text buffer.
<H3><A NAME="sect3" HREF="#toc3">Synset Navigation </A></H3>
Since the <B>Synset </B> structure is used to
represent the synsets for both word senses and pointers, the <I>ptrlist </I>
and <I>nextss </I> fields have different meanings depending on whether the structure
is a word sense or pointer. This can make navigation through the lists
returned by <B>findtheinfo_ds() </B> confusing. <P>
Navigation through the returned
list involves the following: <P>
Following the <I>nextss </I> chain from the synset
returned moves through the various senses of <I>searchstr </I>. <FONT SIZE=-1><B>NULL </B></FONT>
indicates
the end of the chain of senses. <P>
Following the <I>ptrlist </I> chain from a <B>Synset
</B> structure representing a sense traces the hierarchy of the search results
for that sense. Subsequent links in the <I>ptrlist </I> chain indicate the next
level (up or down, depending on the search) in the hierarchy. <FONT SIZE=-1><B>NULL </B></FONT>
indicates
the end of the chain of search result synsets. <P>
If a synset pointed to
by <I>ptrlist </I> has a value in the <I>nextss </I> field, it represents another pointer
of the same type at that level in the hierarchy. For example, some noun
synsets have two hypernyms. Following this <I>nextss </I> pointer, and then the
<I>ptrlist </I> chain from the <B>Synset </B> structure pointed to, traces another,
parallel, hierarchy, until the end is indicated by <FONT SIZE=-1><B>NULL </B></FONT>
on that <I>ptrlist
</I> chain. So, a <B>synset </B> representing a pointer (versus a sense of <I>searchstr
</I>) having a non-NULL value in <I>nextss </I> has another chain of search results
linked through the <I>ptrlist </I> chain of the synset pointed to by <I>nextss </I>.
<P>
If <I>searchstr </I> contains more than one base form in WordNet (as in the
noun <B>axes </B>, which has base forms <B>axe </B> and <B>axis </B>), synsets representing
the search results for each base form are linked through the <I>nextform
</I> pointer of the <B>Synset </B> structure.
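<P>
The navigation rules above can be sketched in C. This is only an illustration under stated assumptions, not the library's code: <B>MiniSynset</B> is a hypothetical stand-in that keeps just the <I>nextss</I> and <I>ptrlist</I> links of the real <B>Synset</B> structure, and the two helper names are made up for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the library's Synset structure: only the two
 * links discussed above are kept. */
typedef struct MiniSynset {
    struct MiniSynset *nextss;   /* next sense, or next pointer of the same type */
    struct MiniSynset *ptrlist;  /* next level of the search hierarchy */
} MiniSynset;

/* Count the senses by following nextss from the synset returned. */
static int count_senses(const MiniSynset *first_sense)
{
    int n = 0;
    for (const MiniSynset *s = first_sense; s != NULL; s = s->nextss)
        n++;
    return n;
}

/* Count every search-result synset reachable from one level: ptrlist leads
 * to the next level, nextss leads to a parallel hierarchy at this level. */
static int count_results(const MiniSynset *level)
{
    int n = 0;
    for (const MiniSynset *p = level; p != NULL; p = p->nextss)
        n += 1 + count_results(p->ptrlist);
    return n;
}
```

For a sense <I>s</I>, calling count_results(s->ptrlist) walks the results for that sense only; the <I>nextss</I> link of the sense itself is deliberately not followed there, because on a sense it leads to the next sense rather than to a parallel hierarchy.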
<H3><A NAME="sect4" HREF="#toc4">WordNet Searches </A></H3>
There is no extensive
description of what each search type is or the results returned. Using
the WordNet interface, examining the source code, and reading <B><A HREF="wndb.5WN.html">wndb</B>(5WN)</A>
are the best ways to see what types of searches are available and the
data returned for each. <P>
Listed below are the valid searches that can be
passed as <I>ptr_type </I> to <B>findtheinfo() </B>. Passing a negative value (when
applicable) causes a recursive, hierarchical search by setting <I>depth </I>
to <B>1 </B> when <B>traceptrs() </B> is called. <P>
<TABLE BORDER=0>
<TR> <TD ALIGN=LEFT><B>ptr_type </B> </TD> <TD ALIGN=CENTER><B>Value </B> </TD> <TD ALIGN=CENTER><B>Pointer </B> </TD> <TD ALIGN=LEFT><B>Search
</B> </TD> </TR>
<TR> <TD ALIGN=LEFT> </TD> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Symbol </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>ANTPTR </TD> <TD ALIGN=CENTER>1 </TD> <TD ALIGN=CENTER>! </TD> <TD ALIGN=LEFT>Antonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>HYPERPTR </TD> <TD ALIGN=CENTER>2 </TD> <TD ALIGN=CENTER>@ </TD> <TD ALIGN=LEFT>Hypernyms
</TD> </TR>
<TR> <TD ALIGN=LEFT>HYPOPTR </TD> <TD ALIGN=CENTER>3 </TD> <TD ALIGN=CENTER>~ </TD> <TD ALIGN=LEFT>Hyponyms </TD> </TR>
<TR> <TD ALIGN=LEFT>ENTAILPTR </TD> <TD ALIGN=CENTER>4 </TD> <TD ALIGN=CENTER>* </TD> <TD ALIGN=LEFT>Entailment </TD> </TR>
<TR> <TD ALIGN=LEFT>SIMPTR </TD> <TD ALIGN=CENTER>5
</TD> <TD ALIGN=CENTER>&amp; </TD> <TD ALIGN=LEFT>Similar </TD> </TR>
<TR> <TD ALIGN=LEFT>ISMEMBERPTR </TD> <TD ALIGN=CENTER>6 </TD> <TD ALIGN=CENTER>#m </TD> <TD ALIGN=LEFT>Member meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>ISSTUFFPTR </TD> <TD ALIGN=CENTER>7 </TD> <TD ALIGN=CENTER>#s
</TD> <TD ALIGN=LEFT>Substance meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>ISPARTPTR </TD> <TD ALIGN=CENTER>8 </TD> <TD ALIGN=CENTER>#p </TD> <TD ALIGN=LEFT>Part meronym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASMEMBERPTR </TD>
<TD ALIGN=CENTER>9 </TD> <TD ALIGN=CENTER>%m </TD> <TD ALIGN=LEFT>Member holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASSTUFFPTR </TD> <TD ALIGN=CENTER>10 </TD> <TD ALIGN=CENTER>%s </TD> <TD ALIGN=LEFT>Substance holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>HASPARTPTR
</TD> <TD ALIGN=CENTER>11 </TD> <TD ALIGN=CENTER>%p </TD> <TD ALIGN=LEFT>Part holonym </TD> </TR>
<TR> <TD ALIGN=LEFT>MERONYM </TD> <TD ALIGN=CENTER>12 </TD> <TD ALIGN=CENTER>% </TD> <TD ALIGN=LEFT>All meronyms </TD> </TR>
<TR> <TD ALIGN=LEFT>HOLONYM </TD> <TD ALIGN=CENTER>13 </TD>
<TD ALIGN=CENTER># </TD> <TD ALIGN=LEFT>All holonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>CAUSETO </TD> <TD ALIGN=CENTER>14 </TD> <TD ALIGN=CENTER>&gt; </TD> <TD ALIGN=LEFT>Cause </TD> </TR>
<TR> <TD ALIGN=LEFT>PPLPTR </TD> <TD ALIGN=CENTER>15 </TD> <TD ALIGN=CENTER>&lt; </TD> <TD ALIGN=LEFT>Participle of
verb </TD> </TR>
<TR> <TD ALIGN=LEFT>SEEALSOPTR </TD> <TD ALIGN=CENTER>16 </TD> <TD ALIGN=CENTER>^ </TD> <TD ALIGN=LEFT>Also see </TD> </TR>
<TR> <TD ALIGN=LEFT>PERTPTR </TD> <TD ALIGN=CENTER>17 </TD> <TD ALIGN=CENTER>\ </TD> <TD ALIGN=LEFT>Pertains to noun
or derived from adjective </TD> </TR>
<TR> <TD ALIGN=LEFT>ATTRIBUTE </TD> <TD ALIGN=CENTER>18 </TD> <TD ALIGN=CENTER>= </TD> <TD ALIGN=LEFT>Attribute </TD> </TR>
<TR> <TD ALIGN=LEFT>VERBGROUP
</TD> <TD ALIGN=CENTER>19 </TD> <TD ALIGN=CENTER>$ </TD> <TD ALIGN=LEFT>Verb group </TD> </TR>
<TR> <TD ALIGN=LEFT>DERIVATION </TD> <TD ALIGN=CENTER>20 </TD> <TD ALIGN=CENTER>+ </TD> <TD ALIGN=LEFT>Derivationally related form </TD>
</TR>
<TR> <TD ALIGN=LEFT>CLASSIFICATION </TD> <TD ALIGN=CENTER>21 </TD> <TD ALIGN=CENTER>; </TD> <TD ALIGN=LEFT>Domain of synset </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS </TD> <TD ALIGN=CENTER>22 </TD> <TD ALIGN=CENTER>- </TD> <TD ALIGN=LEFT>Member of this
domain </TD> </TR>
<TR> <TD ALIGN=LEFT>SYNS </TD> <TD ALIGN=CENTER>23 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Find synonyms </TD> </TR>
<TR> <TD ALIGN=LEFT>FREQ </TD> <TD ALIGN=CENTER>24 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Polysemy </TD>
</TR>
<TR> <TD ALIGN=LEFT>FRAMES </TD> <TD ALIGN=CENTER>25 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Verb example sentences and generic frames </TD> </TR>
<TR> <TD ALIGN=LEFT>COORDS
</TD> <TD ALIGN=CENTER>26 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Noun coordinates </TD> </TR>
<TR> <TD ALIGN=LEFT>RELATIVES </TD> <TD ALIGN=CENTER>27 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Group related senses
</TD> </TR>
<TR> <TD ALIGN=LEFT>HMERONYM </TD> <TD ALIGN=CENTER>28 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Hierarchical meronym search </TD> </TR>
<TR> <TD ALIGN=LEFT>HHOLONYM </TD> <TD ALIGN=CENTER>29 </TD> <TD ALIGN=CENTER><I>n/a
</I> </TD> <TD ALIGN=LEFT>Hierarchical holonym search </TD> </TR>
<TR> <TD ALIGN=LEFT>WNGREP </TD> <TD ALIGN=CENTER>30 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Find keywords by substring
</TD> </TR>
<TR> <TD ALIGN=LEFT>OVERVIEW </TD> <TD ALIGN=CENTER>31 </TD> <TD ALIGN=CENTER><I>n/a </I> </TD> <TD ALIGN=LEFT>Show all synsets for word </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_CATEGORY </TD>
<TD ALIGN=CENTER>32 </TD> <TD ALIGN=CENTER>;c </TD> <TD ALIGN=LEFT>Show domain topic </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_USAGE </TD> <TD ALIGN=CENTER>33 </TD> <TD ALIGN=CENTER>;u </TD> <TD ALIGN=LEFT>Show domain usage
</TD> </TR>
<TR> <TD ALIGN=LEFT>CLASSIF_REGIONAL </TD> <TD ALIGN=CENTER>34 </TD> <TD ALIGN=CENTER>;r </TD> <TD ALIGN=LEFT>Show domain region </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_CATEGORY </TD> <TD ALIGN=CENTER>35
</TD> <TD ALIGN=CENTER>-c </TD> <TD ALIGN=LEFT>Show domain terms for topic </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_USAGE </TD> <TD ALIGN=CENTER>36 </TD> <TD ALIGN=CENTER>-u </TD> <TD ALIGN=LEFT>Show domain terms
for usage </TD> </TR>
<TR> <TD ALIGN=LEFT>CLASS_REGIONAL </TD> <TD ALIGN=CENTER>37 </TD> <TD ALIGN=CENTER>-r </TD> <TD ALIGN=LEFT>Show domain terms for region </TD> </TR>
<TR> <TD ALIGN=LEFT>INSTANCE
</TD> <TD ALIGN=CENTER>38 </TD> <TD ALIGN=CENTER>@i </TD> <TD ALIGN=LEFT>Instance of </TD> </TR>
<TR> <TD ALIGN=LEFT>INSTANCES </TD> <TD ALIGN=CENTER>39 </TD> <TD ALIGN=CENTER>~i </TD> <TD ALIGN=LEFT>Show instances </TD> </TR>
</TABLE>
<P>
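The sign convention described before the table (a negative <I>ptr_type</I> requests a recursive, hierarchical search) can be sketched as follows. This is a minimal illustration, not the library's source: <B>SearchRequest</B> and <B>decode_search()</B> are hypothetical names, and <B>HYPERPTR</B> is redefined locally with the value from the table so the fragment stands alone.

```c
#include <stdlib.h>

#define HYPERPTR 2   /* value copied from the table above */

/* Hypothetical helper type: the two pieces of information carried by a
 * signed search request. */
typedef struct {
    int ptr_type;  /* always positive after decoding */
    int depth;     /* 1 for a recursive, hierarchical search, else 0 */
} SearchRequest;

/* Split a signed request into the pointer type and the depth flag. */
static SearchRequest decode_search(int requested)
{
    SearchRequest r;
    r.ptr_type = abs(requested);
    r.depth = (requested < 0) ? 1 : 0;
    return r;
}
```

Under this sketch, decode_search(-HYPERPTR) yields pointer type 2 with depth 1, while decode_search(HYPERPTR) yields the same pointer type with depth 0.
<P>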
<B>findtheinfo_ds()
</B> cannot perform the following searches: <P>
<blockquote>SEEALSOPTR <BR>
PERTPTR <BR>
VERBGROUP
<BR>
FREQ <BR>
FRAMES <BR>
RELATIVES <BR>
WNGREP <BR>
OVERVIEW <BR>
</blockquote>
<H2><A NAME="sect5" HREF="#toc5">NOTES </A></H2>
Applications that
use WordNet and/or the morphological functions must call <B>wninit() </B> at
the start of the program. See <B><A HREF="wnutil.3WN.html">wnutil</B>(3WN)</A>
for more information. <P>
In all
function calls, <I>searchstr </I> may be either a word or a collocation formed
by joining individual words with underscore characters (<B>_ </B>). <P>
The <B>SearchResults
</B> structure defines fields in the <I>wnresults </I> global variable that are set
by the various search functions. This is a way to get additional information,
such as the number of senses the word has, from the search functions. The
<I>searchds </I> field is set by <B>findtheinfo_ds() </B>. <P>
The <I>pos </I> passed to <B>traceptrs_ds()
</B> is not used. <P>
<H2><A NAME="sect6" HREF="#toc6">SEE ALSO </A></H2>
<B><A HREF="wn.1WN.html">wn</B>(1WN)</A>
, <B><A HREF="wnb.1WN.html">wnb</B>(1WN)</A>
, <B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
, <B><A HREF="binsrch.3WN.html">binsrch</B>(3WN)</A>
,
<B><A HREF="malloc.3.html">malloc</B>(3)</A>
, <B><A HREF="morph.3WN.html">morph</B>(3WN)</A>
, <B><A HREF="wnutil.3WN.html">wnutil</B>(3WN)</A>
, <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
.
<H2><A NAME="sect7" HREF="#toc7">WARNINGS </A></H2>
<B>parse_synset()
</B> must find an exact match between the <I>searchstr </I> passed and a word in
the synset to set <I>whichword </I>. No attempt is made to translate hyphens
and underscores, as is done in <B>getindex() </B>. <P>
The WordNet database and exception
list files must be opened with <B>wninit() </B> prior to using any of the searching
functions. <P>
A large search may cause <B>findtheinfo() </B> to run out of buffer
space. The maximum buffer size is determined by the computer platform. If the
buffer size is exceeded, the following message is printed in the output
buffer: <B>"Search too large. Narrow search and try again..." </B>. <P>
Passing an invalid
<I>pos </I> will probably result in a core dump. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc3" HREF="#sect3">Synset Navigation</A></LI>
<LI><A NAME="toc4" HREF="#sect4">WordNet Searches</A></LI>
</UL>
<LI><A NAME="toc5" HREF="#sect5">NOTES</A></LI>
<LI><A NAME="toc6" HREF="#sect6">SEE ALSO</A></LI>
<LI><A NAME="toc7" HREF="#sect7">WARNINGS</A></LI>
</UL>
</BODY></HTML>


@ -1,80 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNSTATS(7WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wnstats - WordNet 3.0 database statistics
<H2><A NAME="sect1" HREF="#toc1">DESCRIPTION </A></H2>
<H3><A NAME="sect2" HREF="#toc2">Number of
words, synsets, and senses </A></H3>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>POS </B> </TD> <TD ALIGN=CENTER><B>Unique </B> </TD> <TD ALIGN=CENTER><B>Synsets </B> </TD> <TD ALIGN=CENTER><B>Total </B> </TD> </TR>
<TR> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Strings
</B> </TD> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Word-Sense Pairs </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Noun </TD> <TD ALIGN=RIGHT>117798 </TD> <TD ALIGN=RIGHT>82115 </TD> <TD ALIGN=RIGHT>146312 </TD> </TR>
<TR> <TD ALIGN=LEFT>Verb </TD> <TD ALIGN=RIGHT>11529 </TD>
<TD ALIGN=RIGHT>13767 </TD> <TD ALIGN=RIGHT>25047 </TD> </TR>
<TR> <TD ALIGN=LEFT>Adjective </TD> <TD ALIGN=RIGHT>21479 </TD> <TD ALIGN=RIGHT>18156 </TD> <TD ALIGN=RIGHT>30002 </TD> </TR>
<TR> <TD ALIGN=LEFT>Adverb </TD> <TD ALIGN=RIGHT>4481 </TD> <TD ALIGN=RIGHT>3621 </TD>
<TD ALIGN=RIGHT>5580 </TD> </TR>
<TR> <TD ALIGN=LEFT>Totals </TD> <TD ALIGN=RIGHT>155287 </TD> <TD ALIGN=RIGHT>117659 </TD> <TD ALIGN=RIGHT>206941 </TD> </TR>
</TABLE>
<P>
<H3><A NAME="sect3" HREF="#toc3">Polysemy information </A></H3>
<P>
<TABLE BORDER=0>
<TR>
<TD ALIGN=CENTER><B>POS </B> </TD> <TD ALIGN=CENTER><B>Monosemous </B> </TD> <TD ALIGN=CENTER><B>Polysemous </B> </TD> <TD ALIGN=CENTER><B>Polysemous </B> </TD> </TR>
<TR> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Words and Senses </B> </TD> <TD ALIGN=CENTER><B>Words
</B> </TD> <TD ALIGN=CENTER><B>Senses </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Noun </TD> <TD ALIGN=RIGHT>101863 </TD> <TD ALIGN=RIGHT>15935 </TD> <TD ALIGN=RIGHT>44449 </TD> </TR>
<TR> <TD ALIGN=LEFT>Verb </TD> <TD ALIGN=RIGHT>6277 </TD> <TD ALIGN=RIGHT>5252 </TD> <TD ALIGN=RIGHT>18770 </TD>
</TR>
<TR> <TD ALIGN=LEFT>Adjective </TD> <TD ALIGN=RIGHT>16503 </TD> <TD ALIGN=RIGHT>4976 </TD> <TD ALIGN=RIGHT>14399 </TD> </TR>
<TR> <TD ALIGN=LEFT>Adverb </TD> <TD ALIGN=RIGHT>3748 </TD> <TD ALIGN=RIGHT>733 </TD> <TD ALIGN=RIGHT>1832 </TD> </TR>
<TR> <TD ALIGN=LEFT>Totals
</TD> <TD ALIGN=RIGHT>128391 </TD> <TD ALIGN=RIGHT>26896 </TD> <TD ALIGN=RIGHT>79450 </TD> </TR>
</TABLE>
<P>
<TABLE BORDER=0>
<TR> <TD ALIGN=CENTER><B>POS </B> </TD> <TD ALIGN=CENTER><B>Average Polysemy </B> </TD> <TD ALIGN=CENTER><B>Average Polysemy
</B> </TD> </TR>
<TR> <TD ALIGN=CENTER> </TD> <TD ALIGN=CENTER><B>Including Monosemous Words </B> </TD> <TD ALIGN=CENTER><B>Excluding Monosemous Words </B> </TD> </TR>
<TR> <TD ALIGN=LEFT>Noun
</TD> <TD ALIGN=RIGHT>1.24 </TD> <TD ALIGN=RIGHT>2.79 </TD> </TR>
<TR> <TD ALIGN=LEFT>Verb </TD> <TD ALIGN=RIGHT>2.17 </TD> <TD ALIGN=RIGHT>3.57 </TD> </TR>
<TR> <TD ALIGN=LEFT>Adjective </TD> <TD ALIGN=RIGHT>1.40 </TD> <TD ALIGN=RIGHT>2.71 </TD> </TR>
<TR> <TD ALIGN=LEFT>Adverb </TD> <TD ALIGN=RIGHT>1.25 </TD> <TD ALIGN=RIGHT>2.50
</TD> </TR>
</TABLE>
<P>
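The "Including Monosemous Words" column appears to be the total word-sense pairs divided by the unique strings for each category. That reading is an inference from the numbers in the tables, not something the page states. A small sketch that recomputes it, with hypothetical type and function names:

```c
/* Counts copied from the "Number of words, synsets, and senses" table. */
typedef struct {
    const char *pos;
    long unique_strings;
    long word_sense_pairs;
} PosCounts;

static const PosCounts WN30_COUNTS[] = {
    {"Noun",      117798, 146312},
    {"Verb",       11529,  25047},
    {"Adjective",  21479,  30002},
    {"Adverb",      4481,   5580},
};

/* Assumed definition: average polysemy including monosemous words is
 * word-sense pairs divided by unique strings. */
static double average_polysemy(const PosCounts *c)
{
    return (double)c->word_sense_pairs / (double)c->unique_strings;
}
```

Rounded to two decimals, the four ratios agree with the first column of the table above.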
<H2><A NAME="sect4" HREF="#toc4">NOTES </A></H2>
Statistics for all types of adjectives and adjective satellites
are combined. <P>
The total of all unique noun, verb, adjective, and adverb
strings is actually 147278. However, many strings are unique within a syntactic
category, but are in more than one syntactic category. The figures in
the table represent the unique strings in each syntactic category. <P>
<P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">DESCRIPTION</A></LI>
<UL>
<LI><A NAME="toc2" HREF="#sect2">Number of words, synsets, and senses</A></LI>
<LI><A NAME="toc3" HREF="#sect3">Polysemy information</A></LI>
</UL>
<LI><A NAME="toc4" HREF="#sect4">NOTES</A></LI>
</UL>
</BODY></HTML>


@ -1,154 +0,0 @@
<!-- manual page source format generated by PolyglotMan v3.0.3a12, -->
<!-- available via anonymous ftp from ftp.cs.berkeley.edu:/ucb/people/phelps/tcltk/rman.tar.Z -->
<HTML>
<HEAD>
<TITLE>WNUTIL(3WN) manual page</TITLE>
</HEAD>
<BODY>
<A HREF="#toc">Table of Contents</A><P>
<H2><A NAME="sect0" HREF="#toc0">NAME </A></H2>
wninit, re_wninit, cntwords, strtolower, ToLowerCase, strsubst,
getptrtype, getpos, getsstype, StrToPos, GetSynsetForSense, GetDataOffset,
GetPolyCount, WNSnsToStr, GetValidIndexPointer, GetWNSense, GetSenseIndex,
default_display_message
<H2><A NAME="sect1" HREF="#toc1">SYNOPSIS </A></H2>
<P>
<B>#include "wn.h" </B> <P>
<B>int wninit(void); </B> <P>
<B>int
re_wninit(void); </B> <P>
<B>int cntwords(char *str, char separator); </B> <P>
<B>char *strtolower(char
*str); </B> <P>
<B>char *ToLowerCase(char *str); </B> <P>
<B>char *strsubst(char *str, char
from, char to); </B> <P>
<B>int getptrtype(char *ptr_symbol); </B> <P>
<B>int getpos(char *ss_type);
</B> <P>
<B>int getsstype(char *ss_type); </B> <P>
<B>int StrToPos(char *pos); </B> <P>
<B>SynsetPtr GetSynsetForSense(char
*sense_key); </B> <P>
<B>long GetDataOffset(char *sense_key); </B> <P>
<B>int GetPolyCount(char
*sense_key); </B> <P>
<B>char *WNSnsToStr(IndexPtr idx, int sense_num); </B> <P>
<B>IndexPtr
GetValidIndexPointer(char *str, int pos); </B> <P>
<B>int GetWNSense(char *lemma,
char *lex_sense); </B> <P>
<B>SnsIndexPtr GetSenseIndex(char *sense_key); </B> <P>
<B>int GetTagcnt(IndexPtr
idx, int sense); </B> <P>
<B>int default_display_message(char *msg); </B>
<H2><A NAME="sect2" HREF="#toc2">DESCRIPTION
</A></H2>
<P>
The WordNet library contains many utility functions used by the interface
code, other library functions, and various applications and tools. Only
those of importance to the WordNet search code, or which are generally
useful, are described here. <P>
<B>wninit()</B> opens the files necessary for using
WordNet with the WordNet library functions. The database files are opened,
and <B>morphinit()</B> is called to open the exception list files. Returns <B>0
</B> if successful, <B>-1 </B> otherwise. The database and exception list files must
be open before the WordNet search and morphology functions are used. If
the database is successfully opened, the global variable <B>OpenDB </B> is set
to <B>1 </B>. Note that it is possible for the database files to be opened (<B>OpenDB
== 1 </B>), but not the exception list files. <P>
<B>re_wninit()</B> is used to close
the database files and reopen them, and is used exclusively for WordNet
development. <B>re_morphinit() </B> is called to close and reopen the exception
list files. Return codes are as described above. <P>
<B>cntwords()</B> counts the
number of underscore or space separated words in <I>str </I>. A hyphen is passed
in <I>separator </I> if it is to be considered a word delimiter. Otherwise, <I>separator
</I> can be any other character, or an underscore if another character is
not desired. <P>
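The counting rule can be sketched as below. This is a hypothetical re-implementation written for illustration (the names <B>is_delim()</B> and <B>count_words()</B> are not part of the library), and its handling of edge cases such as an empty string or repeated delimiters may differ from <B>cntwords()</B>.

```c
/* A character is a delimiter if it is an underscore, a space, or the
 * caller-supplied separator. */
static int is_delim(char c, char separator)
{
    return c == '_' || c == ' ' || c == separator;
}

/* Count runs of non-delimiter characters in str. */
static int count_words(const char *str, char separator)
{
    int count = 0;
    int in_word = 0;
    for (; *str != '\0'; str++) {
        if (is_delim(*str, separator)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}
```

With a hyphen as the separator, "jack-in-the-box" counts as four words; with an underscore as the separator it counts as one.
<P>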
<B>strtolower()</B> converts <I>str </I> to lower case and removes a trailing
adjective marker, if present. <I>str </I> is actually modified by this function,
and a pointer to the modified string is returned. <P>
<B>ToLowerCase()</B> converts
<I>str </I> to lower case as above, without removing an adjective marker. <P>
<B>strsubst()</B>
replaces all occurrences of <I>from </I> with <I>to </I> in <I>str </I> and returns the resulting
string. <P>
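A character substitution of this kind is typically done in place. The sketch below assumes in-place modification, which the description above does not state explicitly, and uses the hypothetical name <B>substitute_char()</B> rather than the library function.

```c
/* Replace every occurrence of `from` with `to` in str, modifying the
 * buffer in place, and return the same pointer. */
static char *substitute_char(char *str, char from, char to)
{
    for (char *p = str; *p != '\0'; p++) {
        if (*p == from)
            *p = to;
    }
    return str;
}
```

One use is turning a space-separated phrase into the underscore-joined collocation form, for example converting "hot dog bun" to "hot_dog_bun". The buffer must be writable, so a string literal should be copied into an array first.
<P>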
<B>getptrtype()</B> returns the integer <I>ptr_type </I> corresponding to the
pointer character passed in <I>ptr_symbol </I>. See <B><A HREF="wnsearch.3WN.html">wnsearch</B>(3WN)</A>
for a table
of pointer symbols and types. <P>
<B>getpos()</B> returns the integer constant corresponding
to the synset type passed. <I>ss_type </I> may be one of the following: <B>n, v,
a, r, s </B>. If <B>s </B> is passed, <FONT SIZE=-1><B>ADJ </B></FONT>
is returned. Exits with <B>-1 </B> if <I>ss_type
</I> is invalid. <P>
<B>getsstype()</B> works like <B>getpos() </B>, but returns <FONT SIZE=-1><B>SATELLITE </B></FONT>
if <I>ss_type </I> is <B>s </B>. <P>
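The difference between the two mappings can be sketched with a single function. Everything here is illustrative: the <B>SK_</B> constants are placeholders rather than the values defined in <B>wn.h</B>, <B>pos_from_ss_type()</B> is a hypothetical name, and the sketch returns <B>-1</B> for an invalid type instead of exiting.

```c
/* Placeholder constants for this sketch only. */
enum { SK_NOUN = 1, SK_VERB, SK_ADJ, SK_ADV, SK_SATELLITE };

/* Map a synset type string (n, v, a, r, s) to a category constant.
 * keep_satellite == 0 folds s into the adjective category (getpos-like);
 * keep_satellite != 0 reports it separately (getsstype-like). */
static int pos_from_ss_type(const char *ss_type, int keep_satellite)
{
    switch (ss_type[0]) {
    case 'n': return SK_NOUN;
    case 'v': return SK_VERB;
    case 'a': return SK_ADJ;
    case 'r': return SK_ADV;
    case 's': return keep_satellite ? SK_SATELLITE : SK_ADJ;
    default:  return -1;
    }
}
```
<P>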
<B>StrToPos()</B> returns the integer constant corresponding
to the syntactic category passed in <I>pos </I>. <I>pos </I> must be one of the following:
<B>noun, verb, adj, adv </B>. <B>-1 </B> is returned if <I>pos </I> is invalid. <P>
<B>GetSynsetForSense()</B>
returns the synset that contains the word sense <I>sense_key </I> and <FONT SIZE=-1><B>NULL </B></FONT>
in case of error. <P>
<B>GetDataOffset()</B> returns the synset offset for the synset
that contains the word sense <I>sense_key </I>, and <B>0 </B> if <I>sense_key </I> is not in
the sense index file. <P>
<B>GetPolyCount()</B> returns the polysemy count (number of
senses in WordNet) for the lemma encoded in <I>sense_key </I>, and <B>0 </B> if the word is
not found. <P>
<B>WNSnsToStr()</B> returns sense key encoding for <I>sense_num </I> entry
in <I>idx </I>. <P>
<B>GetValidIndexPointer()</B> returns the Index structure for <I>str </I>
in <I>pos </I>. Calls <B><A HREF="morphstr.3WN.html">morphstr</B>(3WN)</A>
to find a valid base form if <I>str </I> is inflected.
<P>
<B>GetWNSense()</B> returns the WordNet sense number for the sense key encoding
represented by <I>lemma </I> and <I>lex_sense </I>. <P>
<B>GetSenseIndex()</B> returns parsed sense
index entry for <I>sense_key </I> and <FONT SIZE=-1><B>NULL </B></FONT>
if <I>sense_key </I> is not in sense index.
<P>
<B>GetTagcnt()</B> returns the number of times the sense passed has been tagged
according to the <I>cntlist </I> file. <P>
<B>default_display_message()</B> simply returns
<B>-1 </B>. This is the default value for the global variable <B>display_message
</B>, which points to a function to call to display an error message. In general,
applications (including the WordNet interfaces) define an application
specific function and set <B>display_message </B> to point to it.
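<P>
The callback arrangement can be sketched without the library. The names below are hypothetical: <B>message_hook</B> stands in for the <B>display_message</B> global, and its signature is assumed from the <B>default_display_message()</B> prototype in the SYNOPSIS.

```c
#include <stdio.h>

/* Default handler for this sketch: ignores the message and returns -1. */
static int quiet_display(char *msg)
{
    (void)msg;
    return -1;
}

/* Stand-in for the global function pointer, initialised to the default. */
static int (*message_hook)(char *) = quiet_display;

/* An application-specific handler that writes the message to stderr. */
static int stderr_display(char *msg)
{
    fprintf(stderr, "%s", msg);
    return 0;
}
```

An application would assign its own handler once at startup, for example message_hook = stderr_display, after which error messages reach the user instead of being discarded.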
<H2><A NAME="sect3" HREF="#toc3">NOTES </A></H2>
<B>include/wn.h
</B> lists all the pointer and search types and their corresponding constant
values. There is no description of what each search type is or the results
returned. Using the WordNet interface is the best way to see what types
of searches are available, and the data returned for each.
<H2><A NAME="sect4" HREF="#toc4">SEE ALSO </A></H2>
<B><A HREF="wnintro.3WN.html">wnintro</B>(3WN)</A>
,
<B><A HREF="wnsearch.3WN.html">wnsearch</B>(3WN)</A>
, <B><A HREF="morph.3WN.html">morph</B>(3WN)</A>
, <B><A HREF="wnintro.5WN.html">wnintro</B>(5WN)</A>
, <B><A HREF="wnintro.7WN.html">wnintro</B>(7WN)</A>
. <P>
<H2><A NAME="sect5" HREF="#toc5">WARNINGS </A></H2>
Error
checking on passed arguments is not rigorous. Passing <FONT SIZE=-1><B>NULL </B></FONT>
pointers
or invalid values will often cause an application to die. <P>
<HR><P>
<A NAME="toc"><B>Table of Contents</B></A><P>
<UL>
<LI><A NAME="toc0" HREF="#sect0">NAME</A></LI>
<LI><A NAME="toc1" HREF="#sect1">SYNOPSIS</A></LI>
<LI><A NAME="toc2" HREF="#sect2">DESCRIPTION</A></LI>
<LI><A NAME="toc3" HREF="#sect3">NOTES</A></LI>
<LI><A NAME="toc4" HREF="#sect4">SEE ALSO</A></LI>
<LI><A NAME="toc5" HREF="#sect5">WARNINGS</A></LI>
</UL>
</BODY></HTML>


@ -1,478 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# doc/man/Makefile. Generated from Makefile.in by configure.
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
srcdir = .
top_srcdir = ../..
pkgdatadir = $(datadir)/WordNet
pkglibdir = $(libdir)/WordNet
pkgincludedir = $(includedir)/WordNet
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = /usr/csl/bin/install -c
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/man
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
man1dir = $(mandir)/man1
am__installdirs = "$(DESTDIR)$(man1dir)" "$(DESTDIR)$(man3dir)" \
"$(DESTDIR)$(man5dir)" "$(DESTDIR)$(man7dir)"
man3dir = $(mandir)/man3
man5dir = $(mandir)/man5
man7dir = $(mandir)/man7
NROFF = nroff
MANS = $(man_MANS)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run aclocal-1.9
AMDEP_FALSE = #
AMDEP_TRUE =
AMTAR = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run tar
AUTOCONF = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoconf
AUTOHEADER = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoheader
AUTOMAKE = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run automake-1.9
AWK = nawk
CC = gcc
CCDEPMODE = depmode=gcc3
CFLAGS = -g -O2
CPP = gcc -E
CPPFLAGS =
CYGPATH_W = echo
DEFS = -DHAVE_CONFIG_H
DEPDIR = .deps
ECHO_C =
ECHO_N = -n
ECHO_T =
EGREP = egrep
EXEEXT =
INSTALL_DATA = ${INSTALL} -m 644
INSTALL_PROGRAM = ${INSTALL}
INSTALL_SCRIPT = ${INSTALL}
INSTALL_STRIP_PROGRAM = ${SHELL} $(install_sh) -c -s
LDFLAGS =
LIBOBJS =
LIBS =
LTLIBOBJS =
MAKEINFO = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run makeinfo
OBJEXT = o
PACKAGE = WordNet
PACKAGE_BUGREPORT = wordnet@princeton.edu
PACKAGE_NAME = WordNet
PACKAGE_STRING = WordNet 3.0
PACKAGE_TARNAME = wordnet
PACKAGE_VERSION = 3.0
PATH_SEPARATOR = :
RANLIB = ranlib
SET_MAKE =
SHELL = /bin/bash
STRIP =
TCL_INCLUDE_SPEC = -I/usr/csl/include
TCL_LIB_SPEC = -L/usr/csl/lib -ltcl8.4
TK_LIBS = -L/usr/openwin/lib -lX11 -ldl -lpthread -lsocket -lnsl -lm
TK_LIB_SPEC = -L/usr/csl/lib -ltk8.4
TK_PREFIX = /usr/csl
TK_XINCLUDES = -I/usr/openwin/include
VERSION = 3.0
ac_ct_CC = gcc
ac_ct_RANLIB = ranlib
ac_ct_STRIP =
ac_prefix = /usr/local/WordNet-3.0
am__fastdepCC_FALSE = #
am__fastdepCC_TRUE =
am__include = include
am__leading_dot = .
am__quote =
am__tar = ${AMTAR} chof - "$$tardir"
am__untar = ${AMTAR} xf -
bindir = ${exec_prefix}/bin
build_alias =
datadir = ${prefix}/share
exec_prefix = ${prefix}
host_alias =
includedir = ${prefix}/include
infodir = ${prefix}/info
install_sh = /people/wn/src/Release/3.0/Unix/install-sh
libdir = ${exec_prefix}/lib
libexecdir = ${exec_prefix}/libexec
localstatedir = ${prefix}/var
mandir = ${prefix}/man
mkdir_p = $(install_sh) -d
oldincludedir = /usr/include
prefix = /usr/local/WordNet-3.0
program_transform_name = s,x,x,
sbindir = ${exec_prefix}/sbin
sharedstatedir = ${prefix}/com
sysconfdir = ${prefix}/etc
target_alias =
man_MANS = binsrch.3 cntlist.5 grind.1 lexnames.5 morph.3 morphy.7 senseidx.5 uniqbeg.7 wn.1 wnb.1 wndb.5 wngloss.7 wngroups.7 wninput.5 wnintro.1 wnintro.3 wnintro.5 wnintro.7 wnlicens.7 wnpkgs.7 wnsearch.3 wnstats.7 wnutil.3
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/man/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/man/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-man1: $(man1_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man1dir)" || $(mkdir_p) "$(DESTDIR)$(man1dir)"
@list='$(man1_MANS) $(dist_man1_MANS) $(nodist_man1_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.1*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
1*) ;; \
*) ext='1' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man1dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man1dir)/$$inst"; \
done
uninstall-man1:
@$(NORMAL_UNINSTALL)
@list='$(man1_MANS) $(dist_man1_MANS) $(nodist_man1_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.1*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
1*) ;; \
*) ext='1' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man1dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man1dir)/$$inst"; \
done
install-man3: $(man3_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man3dir)" || $(mkdir_p) "$(DESTDIR)$(man3dir)"
@list='$(man3_MANS) $(dist_man3_MANS) $(nodist_man3_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.3*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
3*) ;; \
*) ext='3' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man3dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man3dir)/$$inst"; \
done
uninstall-man3:
@$(NORMAL_UNINSTALL)
@list='$(man3_MANS) $(dist_man3_MANS) $(nodist_man3_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.3*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
3*) ;; \
*) ext='3' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man3dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man3dir)/$$inst"; \
done
install-man5: $(man5_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man5dir)" || $(mkdir_p) "$(DESTDIR)$(man5dir)"
@list='$(man5_MANS) $(dist_man5_MANS) $(nodist_man5_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.5*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
5*) ;; \
*) ext='5' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man5dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man5dir)/$$inst"; \
done
uninstall-man5:
@$(NORMAL_UNINSTALL)
@list='$(man5_MANS) $(dist_man5_MANS) $(nodist_man5_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.5*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
5*) ;; \
*) ext='5' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man5dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man5dir)/$$inst"; \
done
install-man7: $(man7_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man7dir)" || $(mkdir_p) "$(DESTDIR)$(man7dir)"
@list='$(man7_MANS) $(dist_man7_MANS) $(nodist_man7_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.7*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
7*) ;; \
*) ext='7' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man7dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man7dir)/$$inst"; \
done
uninstall-man7:
@$(NORMAL_UNINSTALL)
@list='$(man7_MANS) $(dist_man7_MANS) $(nodist_man7_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.7*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
7*) ;; \
*) ext='7' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man7dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man7dir)/$$inst"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(MANS)
installdirs:
for dir in "$(DESTDIR)$(man1dir)" "$(DESTDIR)$(man3dir)" "$(DESTDIR)$(man5dir)" "$(DESTDIR)$(man7dir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-man
install-exec-am:
install-info: install-info-am
install-man: install-man1 install-man3 install-man5 install-man7
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-info-am uninstall-man
uninstall-man: uninstall-man1 uninstall-man3 uninstall-man5 \
uninstall-man7
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-man1 install-man3 install-man5 install-man7 \
install-strip installcheck installcheck-am installdirs \
maintainer-clean maintainer-clean-generic mostlyclean \
mostlyclean-generic pdf pdf-am ps ps-am uninstall uninstall-am \
uninstall-info-am uninstall-man uninstall-man1 uninstall-man3 \
uninstall-man5 uninstall-man7
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1 +0,0 @@
man_MANS = binsrch.3 cntlist.5 grind.1 lexnames.5 morph.3 morphy.7 senseidx.5 uniqbeg.7 wn.1 wnb.1 wndb.5 wngloss.7 wngroups.7 wninput.5 wnintro.1 wnintro.3 wnintro.5 wnintro.7 wnlicens.7 wnpkgs.7 wnsearch.3 wnstats.7 wnutil.3

@ -1,478 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# @configure_input@
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
@SET_MAKE@
srcdir = @srcdir@
top_srcdir = @top_srcdir@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = @INSTALL@
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/man
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
man1dir = $(mandir)/man1
am__installdirs = "$(DESTDIR)$(man1dir)" "$(DESTDIR)$(man3dir)" \
"$(DESTDIR)$(man5dir)" "$(DESTDIR)$(man7dir)"
man3dir = $(mandir)/man3
man5dir = $(mandir)/man5
man7dir = $(mandir)/man7
NROFF = nroff
MANS = $(man_MANS)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
AMDEP_FALSE = @AMDEP_FALSE@
AMDEP_TRUE = @AMDEP_TRUE@
AMTAR = @AMTAR@
AUTOCONF = @AUTOCONF@
AUTOHEADER = @AUTOHEADER@
AUTOMAKE = @AUTOMAKE@
AWK = @AWK@
CC = @CC@
CCDEPMODE = @CCDEPMODE@
CFLAGS = @CFLAGS@
CPP = @CPP@
CPPFLAGS = @CPPFLAGS@
CYGPATH_W = @CYGPATH_W@
DEFS = @DEFS@
DEPDIR = @DEPDIR@
ECHO_C = @ECHO_C@
ECHO_N = @ECHO_N@
ECHO_T = @ECHO_T@
EGREP = @EGREP@
EXEEXT = @EXEEXT@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
LDFLAGS = @LDFLAGS@
LIBOBJS = @LIBOBJS@
LIBS = @LIBS@
LTLIBOBJS = @LTLIBOBJS@
MAKEINFO = @MAKEINFO@
OBJEXT = @OBJEXT@
PACKAGE = @PACKAGE@
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
PACKAGE_NAME = @PACKAGE_NAME@
PACKAGE_STRING = @PACKAGE_STRING@
PACKAGE_TARNAME = @PACKAGE_TARNAME@
PACKAGE_VERSION = @PACKAGE_VERSION@
PATH_SEPARATOR = @PATH_SEPARATOR@
RANLIB = @RANLIB@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
STRIP = @STRIP@
TCL_INCLUDE_SPEC = @TCL_INCLUDE_SPEC@
TCL_LIB_SPEC = @TCL_LIB_SPEC@
TK_LIBS = @TK_LIBS@
TK_LIB_SPEC = @TK_LIB_SPEC@
TK_PREFIX = @TK_PREFIX@
TK_XINCLUDES = @TK_XINCLUDES@
VERSION = @VERSION@
ac_ct_CC = @ac_ct_CC@
ac_ct_RANLIB = @ac_ct_RANLIB@
ac_ct_STRIP = @ac_ct_STRIP@
ac_prefix = @ac_prefix@
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
am__include = @am__include@
am__leading_dot = @am__leading_dot@
am__quote = @am__quote@
am__tar = @am__tar@
am__untar = @am__untar@
bindir = @bindir@
build_alias = @build_alias@
datadir = @datadir@
exec_prefix = @exec_prefix@
host_alias = @host_alias@
includedir = @includedir@
infodir = @infodir@
install_sh = @install_sh@
libdir = @libdir@
libexecdir = @libexecdir@
localstatedir = @localstatedir@
mandir = @mandir@
mkdir_p = @mkdir_p@
oldincludedir = @oldincludedir@
prefix = @prefix@
program_transform_name = @program_transform_name@
sbindir = @sbindir@
sharedstatedir = @sharedstatedir@
sysconfdir = @sysconfdir@
target_alias = @target_alias@
man_MANS = binsrch.3 cntlist.5 grind.1 lexnames.5 morph.3 morphy.7 senseidx.5 uniqbeg.7 wn.1 wnb.1 wndb.5 wngloss.7 wngroups.7 wninput.5 wnintro.1 wnintro.3 wnintro.5 wnintro.7 wnlicens.7 wnpkgs.7 wnsearch.3 wnstats.7 wnutil.3
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/man/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/man/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-man1: $(man1_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man1dir)" || $(mkdir_p) "$(DESTDIR)$(man1dir)"
@list='$(man1_MANS) $(dist_man1_MANS) $(nodist_man1_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.1*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
1*) ;; \
*) ext='1' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man1dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man1dir)/$$inst"; \
done
uninstall-man1:
@$(NORMAL_UNINSTALL)
@list='$(man1_MANS) $(dist_man1_MANS) $(nodist_man1_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.1*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
1*) ;; \
*) ext='1' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man1dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man1dir)/$$inst"; \
done
install-man3: $(man3_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man3dir)" || $(mkdir_p) "$(DESTDIR)$(man3dir)"
@list='$(man3_MANS) $(dist_man3_MANS) $(nodist_man3_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.3*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
3*) ;; \
*) ext='3' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man3dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man3dir)/$$inst"; \
done
uninstall-man3:
@$(NORMAL_UNINSTALL)
@list='$(man3_MANS) $(dist_man3_MANS) $(nodist_man3_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.3*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
3*) ;; \
*) ext='3' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man3dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man3dir)/$$inst"; \
done
install-man5: $(man5_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man5dir)" || $(mkdir_p) "$(DESTDIR)$(man5dir)"
@list='$(man5_MANS) $(dist_man5_MANS) $(nodist_man5_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.5*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
5*) ;; \
*) ext='5' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man5dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man5dir)/$$inst"; \
done
uninstall-man5:
@$(NORMAL_UNINSTALL)
@list='$(man5_MANS) $(dist_man5_MANS) $(nodist_man5_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.5*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
5*) ;; \
*) ext='5' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man5dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man5dir)/$$inst"; \
done
install-man7: $(man7_MANS) $(man_MANS)
@$(NORMAL_INSTALL)
test -z "$(man7dir)" || $(mkdir_p) "$(DESTDIR)$(man7dir)"
@list='$(man7_MANS) $(dist_man7_MANS) $(nodist_man7_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.7*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
if test -f $(srcdir)/$$i; then file=$(srcdir)/$$i; \
else file=$$i; fi; \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
7*) ;; \
*) ext='7' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " $(INSTALL_DATA) '$$file' '$(DESTDIR)$(man7dir)/$$inst'"; \
$(INSTALL_DATA) "$$file" "$(DESTDIR)$(man7dir)/$$inst"; \
done
uninstall-man7:
@$(NORMAL_UNINSTALL)
@list='$(man7_MANS) $(dist_man7_MANS) $(nodist_man7_MANS)'; \
l2='$(man_MANS) $(dist_man_MANS) $(nodist_man_MANS)'; \
for i in $$l2; do \
case "$$i" in \
*.7*) list="$$list $$i" ;; \
esac; \
done; \
for i in $$list; do \
ext=`echo $$i | sed -e 's/^.*\\.//'`; \
case "$$ext" in \
7*) ;; \
*) ext='7' ;; \
esac; \
inst=`echo $$i | sed -e 's/\\.[0-9a-z]*$$//'`; \
inst=`echo $$inst | sed -e 's/^.*\///'`; \
inst=`echo $$inst | sed '$(transform)'`.$$ext; \
echo " rm -f '$(DESTDIR)$(man7dir)/$$inst'"; \
rm -f "$(DESTDIR)$(man7dir)/$$inst"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(MANS)
installdirs:
for dir in "$(DESTDIR)$(man1dir)" "$(DESTDIR)$(man3dir)" "$(DESTDIR)$(man5dir)" "$(DESTDIR)$(man7dir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-man
install-exec-am:
install-info: install-info-am
install-man: install-man1 install-man3 install-man5 install-man7
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-info-am uninstall-man
uninstall-man: uninstall-man1 uninstall-man3 uninstall-man5 \
uninstall-man7
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-man1 install-man3 install-man5 install-man7 \
install-strip installcheck installcheck-am installdirs \
maintainer-clean maintainer-clean-generic mostlyclean \
mostlyclean-generic pdf pdf-am ps ps-am uninstall uninstall-am \
uninstall-info-am uninstall-man uninstall-man1 uninstall-man3 \
uninstall-man5 uninstall-man7
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1,66 +0,0 @@
'\" t
.\" $Id$
.TH BINSRCH 3WN "Dec 2006" "WordNet 3.0" "WordNet\(tm Library Functions"
.SH NAME
bin_search, copyfile, replace_line, insert_line
.SH SYNOPSIS
.LP
\fBchar *bin_search(char *key, FILE *fp);\fP
.LP
\fBvoid copyfile(FILE *fromfp, FILE *tofp);\fP
.LP
\fBchar *replace_line(char *new_line, char *key, FILE *fp);\fP
.LP
\fBchar *insert_line(char *new_line, char *key, FILE *fp);\fP
.SH DESCRIPTION
.LP
The WordNet library contains several general purpose functions for
performing a binary search and modifying sorted files.
.LP
.B bin_search(\|)
is the primary binary search algorithm to search for \fIkey\fP as the
first item on a line in the file pointed to by \fIfp\fP. The
delimiter between the key and the rest of the fields on the line, if
any, must be a space. A pointer to a static variable containing the
entire line is returned.
.SB NULL
is returned if a match is not found.
.LP
The remaining functions are not used by WordNet, and are only briefly
described.
.LP
.B copyfile(\|)
copies the contents of one file to another.
.LP
.B replace_line(\|)
replaces a line in a file having searchkey \fIkey\fP
with the contents of \fInew_line\fP.
It returns the original line or
.SB NULL
in case of error.
.LP
.B insert_line(\|)
finds the proper place to insert the contents of \fInew_line\fP,
having searchkey \fIkey\fP in the sorted file pointed to by \fIfp\fP.
It returns
.SB NULL
if a line with this searchkey is already in the file.
.SH NOTES
The maximum length of \fIkey\fP is 1024.
The maximum line length in a file is 25K.
If there are no additional fields after the search key, the key must
be followed by at least one space before the newline character.
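.LP
The lookup that \fBbin_search(\|)\fP performs can be illustrated with a
short Python sketch. This is not the library's implementation: the
helper name \fBfind_line\fP and the sample lines are hypothetical, and
the sketch searches lines already held in memory rather than an open file.
.nf
```python
import bisect


def find_line(sorted_lines, key):
    """Return the line whose first space-delimited field equals key.

    Hypothetical helper. Assumes sorted_lines is sorted by that first
    field. Returns None when no line matches.
    """
    # The search key of a line ends at the first space.
    keys = [line.split(" ", 1)[0] for line in sorted_lines]
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return sorted_lines[i]
    return None


# Made-up sample data; the last line has a key followed by one space only.
lines = ["apple 1 2", "banana 3", "cherry "]
print(find_line(lines, "banana"))
print(find_line(lines, "durian"))
```
.fi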
.SH SEE ALSO
.BR wnintro (3WN),
.BR morph (3WN),
.BR wnsearch (3WN),
.BR wnutil (3WN),
.BR wnintro (5WN).
.SH WARNINGS
\fBbin_search(\|)\fP returns a pointer to a static character buffer.
The returned string should be copied by the caller if the results need
to be saved, as a subsequent call will replace the contents of the
static buffer.

@ -1,92 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH CNTLIST 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
cntlist \- file listing number of times each tagged sense occurs in a
semantic concordance, sorted most to least frequently tagged
cntlist.rev \- file listing number of times each tagged sense occurs
in a semantic concordance, sorted by sense key
.SH DESCRIPTION
A cntlist file for a semantic concordance lists the number of times
each semantically tagged sense occurs in the concordance and its
sense number in the WordNet database. Each line in the file
corresponds to a sense in the WordNet database to which at least one
semantic tag points. Only senses that are tagged in a concordance are
in the concordance's cntlist file.
.SS WordNet Database \fIcntlist\fP File
In the WordNet database, words are assigned sense numbers based on
frequency of use in semantically tagged corpora. The cntlist file used
by
.BR grind (1WN)
to build the WordNet database and assign the sense numbers is a union
of the cntlist files from the various semantic concordances that were
formerly released by Princeton University. This
combined cntlist file is provided with the WordNet package and is
found in the \fBWNSEARCHDIR\fP directory.
The \fIcntlist.rev\fP file is used at run-time by the WordNet
library code and browser interfaces to print in the output display the
number of times each sense has been tagged.
.SS File Format
Each line in a cntlist file contains information for one sense. The
file is ordered from most to least frequently tagged sense. The
fields are separated by one space, and each line is terminated with a
newline character. Senses having the same \fItag_cnt\fP value are
listed in reverse alphabetical order of the \fIlemma\fP field of the
\fIsense_key\fP.
Each line in \fBcntlist\fP is of the form:
.RS
\fItag_cnt~~sense_key~~sense_number\fP
.RE
where \fItag_cnt\fP is the decimal number of times the sense is tagged
in the corresponding semantic concordance. \fIsense_key\fP is a
WordNet sense encoding and \fIsense_number\fP is a WordNet sense
number as described in
.BR senseidx (5WN).
The \fIcntlist.rev\fP file contains the same fields described above,
in the following order:
.RS
\fIsense_key~~sense_number~~tag_cnt\fP
.RE
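.LP
The two field orders can be read with a few lines of Python. The
following is only a sketch: the names \fBTagCount\fP,
\fBparse_cntlist_line\fP and \fBparse_cntlist_rev_line\fP are
hypothetical, and the sample lines are made up rather than taken from a
real cntlist file.
.nf
```python
from typing import NamedTuple


class TagCount(NamedTuple):
    tag_cnt: int
    sense_key: str
    sense_number: int


def parse_cntlist_line(line):
    # cntlist field order: tag_cnt sense_key sense_number
    tag_cnt, sense_key, sense_number = line.split()
    return TagCount(int(tag_cnt), sense_key, int(sense_number))


def parse_cntlist_rev_line(line):
    # cntlist.rev field order: sense_key sense_number tag_cnt
    sense_key, sense_number, tag_cnt = line.split()
    return TagCount(int(tag_cnt), sense_key, int(sense_number))


# Made-up sample values for illustration only.
print(parse_cntlist_line("12 example%1:00:00:: 1"))
print(parse_cntlist_rev_line("example%1:00:00:: 1 12"))
```
.fi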
.SH NOTES
Princeton no longer maintains or releases the Semantic Concordance
files. The \fIcntlist\fP file used to order the senses in WordNet
3.0 was generated from the Semantic Concordance files at the point
that they were last updated in 2001. The order of senses presented
usually reflects what the user would expect; however, sense
ordering is now less reliable than in prior releases and should not be
construed as an accurate indicator of frequency of use.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.TP 20
.B HKEY_CURRENT_USER\eSOFTWARE\eWordNet\e3.0\ewnres
User's default browser options.
.SH FILES
.TP 20
.B cntlist, cntlist.rev
file of combined semantic concordance \fBcntlist\fP files. Used to
assign sense numbers in WordNet database
.SH SEE ALSO
.BR grind (1WN),
.BR wnintro (5WN),
.BR senseidx (5WN).

@ -1,161 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH GRIND 1 "Dec 2006" "WordNet 3.0" "WordNet\(tm User Commands"
.SH NAME
grind \- process WordNet lexicographer files
.SH SYNOPSIS
\fBgrind\fP [ \fB\-v\fP ] [ \fB\-s\fP ] [ \fB\-L\fP\fIlogfile\fP ] [ \fB\-a\fP ] [ \fB\-d\fP ] [ \fB\-i\fP ] [ \fB\-o\fP ] [ \fB\-n\fP ] \fIfilename\fP [ \fIfilename\fP\&.\|.\|. ]
.SH DESCRIPTION
\fBgrind(\|)\fP processes WordNet lexicographer files, producing
database files suitable for use with the WordNet search and interface
code and other applications. The syntactic and structural integrity
of the input files is verified. Warnings and errors are reported via
\fBstderr\fP and a run-time log is produced on \fBstdout\fP. A
database is generated only if there are no errors.
.SS Input Files
Input files correspond to the syntactic categories implemented in
WordNet \-
.BR noun ", "
.BR verb ", "
.BR adjective " and "
.BR adverb .
Each input lexicographer file consists of a list of synonym sets
(\fIsynsets\fP) for one part of speech. Although the basic synset
syntax is the same for all of the parts of speech, some parts of the
syntax only apply to a particular part of speech. See
.BR wninput (5WN)
for a description of the input file format.
Each \fIfilename\fP specified is of the form:
.RS
.IB pathname / pos . suffix
.RE
where \fIpathname\fP is optional and \fIpos\fP is either
.BR noun ", "
.BR verb ", "
.BR adj " or "
.BR adv .
\fIsuffix\fP may be used to separate groups of synsets into different
files, for example \fBnoun.animal\fP and \fBnoun.plant\fP. One or
more input files, in any combination of syntactic categories, may be
specified. See
.BR lexnames (5WN)
for a list of the lexicographer files used to build the complete
WordNet database.
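.LP
A minimal Python sketch of how a name of this form could be taken
apart is shown below. It does not reproduce the checks made by
\fBgrind(\|)\fP itself; the function name \fBsplit_lexfile_name\fP and
the example path are hypothetical.
.nf
```python
import os

VALID_POS = ("noun", "verb", "adj", "adv")


def split_lexfile_name(filename):
    """Split pathname/pos.suffix into (pos, suffix); hypothetical helper."""
    # The pathname part is optional, so only the final component matters.
    base = os.path.basename(filename)
    pos, _, suffix = base.partition(".")
    if pos not in VALID_POS:
        raise ValueError("not a lexicographer file name: " + filename)
    return pos, suffix


print(split_lexfile_name("/tmp/lexfiles/noun.animal"))
print(split_lexfile_name("adj.all"))
```
.fi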
.SS Output Files
\fBgrind(\|)\fP produces the following output files:
.TS
center box ;
c | c
l | l.
\fBFilename	Description\fP
_
\fBindex.\fIpos\fR	Index file for each syntactic category
\fBdata.\fIpos\fR	Data file for each syntactic category
\fBindex.sense\fP	Sense index
.TE
See
.BR wndb (5WN)
for a description of the database file formats.
Each time \fBgrind(\|)\fP is run, any existing database files are
overwritten with the database files generated from the specified input
files. If no input files from a syntactic category are specified,
the corresponding database files are not overwritten.
.SS Sense Numbers
Senses are generally ordered from most to least frequently used, with
the most common sense numbered \fB1\fP. Frequency of use is
determined by the number of times a sense is tagged in the various
semantic concordance texts. Senses that are not semantically tagged
follow the ordered senses in an arbitrary order.
Note that this ordering is only an
estimate based on usage in a small corpus.
The \fItagsense_cnt\fP field for each
entry in the \fBindex.\fIpos\fR files indicates how many of the senses
in the list have been tagged.
The \fBcntlist\fP file provided with the database lists the number of
times each sense is tagged in the semantic concordances.
\fBgrind(\|)\fP uses the data from \fBcntlist\fP to order the senses
of each word. When the \fBindex\fP.\fIpos\fP files are generated, the
\fIsynset_offset\fPs are output in sense number order, with sense 1
first in the list. Senses with the same number of semantic tags are
assigned unique but consecutive sense numbers. The WordNet
.SB OVERVIEW
search displays all senses of the specified word, in all syntactic
categories, and indicates which of the senses are represented in the
semantically tagged texts.
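.LP
The ordering rule described above can be sketched in Python as
follows. This is an illustration under stated assumptions, not the
code used by \fBgrind(\|)\fP: the function name \fBnumber_senses\fP and
its inputs are hypothetical, and untagged senses are simply kept in
input order because their order is arbitrary.
.nf
```python
def number_senses(senses, tag_counts):
    """Assign sense numbers, most frequently tagged sense first.

    senses: sense identifiers for one word (hypothetical input).
    tag_counts: dict mapping a sense identifier to its tag count.
    """
    tagged = [s for s in senses if tag_counts.get(s, 0) > 0]
    untagged = [s for s in senses if tag_counts.get(s, 0) == 0]
    # list.sort() is stable, so senses with equal counts keep their
    # input order and still receive distinct, consecutive numbers.
    tagged.sort(key=lambda s: tag_counts[s], reverse=True)
    ordered = tagged + untagged
    return {sense: n for n, sense in enumerate(ordered, start=1)}


# Made-up identifiers and counts.
print(number_senses(["a", "b", "c", "d"], {"a": 2, "b": 5, "c": 2}))
```
.fi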
.SH OPTIONS
.TP 15
.B \-v
Verify integrity of input without generating database.
.TP 15
.B \-s
Suppress generation of warning messages. Usually \fBgrind\fP is run
with this option until all syntactic and structural errors are corrected
since the warning messages may make it difficult to spot error
messages.
.TP 15
.BI \-L logfile
Write all messages to \fIlogfile\fP instead of \fBstderr\fP.
.TP 15
.B \-a
Generate statistical report on input files processed.
.TP 15
.B \-d
Generate distribution of senses by string length report on input files
processed.
.TP 15
.B \-i
Generate sense index file.
.TP 15
.B \-o
Order senses using \fBcntlist\fP.
.TP 15
.B \-n
Generate nominalization (derivational morphology) links in database.
.TP 15
.I filename
Input file of the form described in
.SB Input Files.
.SH FILES
.TP 20
.B \fIpos\fP.*
lexicographer files to use to build database
.TP 20
.B cntlist
file of combined semantic concordance \fBcntlist\fP files. Used to
assign sense numbers in WordNet database
.SH SEE ALSO
.BR cntlist (5WN),
.BR lexnames (5WN),
.BR senseidx (5WN),
.BR wndb (5WN),
.BR wninput (5WN),
.BR uniqbeg (7WN),
.BR wngloss (7WN).
.SH DIAGNOSTICS
Exit status is normally 0.
Exit status is -1 if a non-specific error occurs.
If syntactic or structural errors exist, exit status is number of
errors detected.
.TP
.B "usage: grind [\-v] [\-s] [\-Llogfile] [\-a ] [\-d] [\-i] [\-o] [\-n] filename [filename...]"
Invalid options were specified on the command line.
.TP
.B No input files processed.
None of the filenames specified were of the appropriate form.
.TP
.B \fIn\fP syntactic errors found.
Syntax errors were found while parsing the input files.
.TP
.B \fIn\fP structural errors found.
Pointer errors were found that could not be automatically corrected.
.SH BUGS
Please report bugs to \fBwordnet@princeton.edu\fP.

@ -1,123 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH LEXNAMES 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
List of WordNet lexicographer file names and numbers
.SH DESCRIPTION
During WordNet development synsets are organized into forty-five
lexicographer files based on syntactic category and logical groupings.
.BR grind (1WN)
processes these files and produces a database suitable for use with
the WordNet library, interface code, and other applications. The
format of the lexicographer files is described in
.BR wninput (5WN).
A file number corresponds to each lexicographer file. File numbers
are encoded in several parts of the WordNet system as an efficient way
to indicate a lexicographer file name. The file \fBlexnames\fP lists
the mapping between file names and numbers, and can be used by
programs or end users to correlate the two.
.SS File Format
Each line in \fBlexnames\fP contains 3 tab separated fields, and is
terminated with a newline character. The first field is the two digit
decimal integer file number. (The first file in the list is numbered
\fB00\fP.) The second field is the name of the lexicographer file that
is represented by that number, and the third field is an integer that
indicates the syntactic category of the synsets contained in the file.
This is simply a shortcut for programs and scripts, since the
syntactic category is also part of the lexicographer file's name.
.SS Syntactic Category
The syntactic category field is encoded as follows:
.RS
.nf
\fB1\fP NOUN
\fB2\fP VERB
\fB3\fP ADJECTIVE
\fB4\fP ADVERB
.fi
.RE
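.LP
A line of \fBlexnames\fP can be decoded with a small Python sketch
such as the one below. The function name \fBparse_lexnames_line\fP is
hypothetical, and the sample line is constructed from the table and
encoding given in this page rather than copied from the file; it uses
spaces where the real file uses tabs.
.nf
```python
POS_NAMES = {1: "NOUN", 2: "VERB", 3: "ADJECTIVE", 4: "ADVERB"}


def parse_lexnames_line(line):
    """Return (file number, file name, syntactic category name)."""
    # split() with no argument splits on any whitespace, including tabs
    # and the trailing newline.
    number, name, category = line.split()
    return int(number), name, POS_NAMES[int(category)]


print(parse_lexnames_line("05 noun.animal 1"))
```
.fi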
.SS Lexicographer Files
The names of the lexicographer files and their corresponding file
numbers are listed below along with a brief description of each file's
contents.
.RS
.TS
center ;
l l l.
\fBFile Number\fP	\fBName\fP	\fBContents\fP
_
00	adj.all	all adjective clusters
01	adj.pert	relational adjectives (pertainyms)
02	adv.all	all adverbs
03	noun.Tops	unique beginner for nouns
04	noun.act	nouns denoting acts or actions
05	noun.animal	nouns denoting animals
06	noun.artifact	nouns denoting man-made objects
07	noun.attribute	nouns denoting attributes of people and objects
08	noun.body	nouns denoting body parts
09	noun.cognition	nouns denoting cognitive processes and contents
10	noun.communication	nouns denoting communicative processes and contents
11	noun.event	nouns denoting natural events
12	noun.feeling	nouns denoting feelings and emotions
13	noun.food	nouns denoting foods and drinks
14	noun.group	nouns denoting groupings of people or objects
15	noun.location	nouns denoting spatial position
16	noun.motive	nouns denoting goals
17	noun.object	nouns denoting natural objects (not man-made)
18	noun.person	nouns denoting people
19	noun.phenomenon	nouns denoting natural phenomena
20	noun.plant	nouns denoting plants
21	noun.possession	nouns denoting possession and transfer of possession
22	noun.process	nouns denoting natural processes
23	noun.quantity	nouns denoting quantities and units of measure
24	noun.relation	nouns denoting relations between people or things or ideas
25	noun.shape	nouns denoting two and three dimensional shapes
26	noun.state	nouns denoting stable states of affairs
27	noun.substance	nouns denoting substances
28	noun.time	nouns denoting time and temporal relations
29	verb.body	verbs of grooming, dressing and bodily care
30	verb.change	verbs of size, temperature change, intensifying, etc.
31	verb.cognition	verbs of thinking, judging, analyzing, doubting
32	verb.communication	verbs of telling, asking, ordering, singing
33	verb.competition	verbs of fighting, athletic activities
34	verb.consumption	verbs of eating and drinking
35	verb.contact	verbs of touching, hitting, tying, digging
36	verb.creation	verbs of sewing, baking, painting, performing
37	verb.emotion	verbs of feeling
38	verb.motion	verbs of walking, flying, swimming
39	verb.perception	verbs of seeing, hearing, feeling
40	verb.possession	verbs of buying, selling, owning
41	verb.social	verbs of political and social activities and events
42	verb.stative	verbs of being, having, spatial relations
43	verb.weather	verbs of raining, snowing, thawing, thundering
44	adj.ppl	participial adjectives
.TE
.RE
.SH NOTES
The lexicographer files are not included in the WordNet database package.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B lexnames
list of lexicographer file names and numbers
.SH SEE ALSO
.BR grind (1WN),
.BR wnintro (5WN),
.BR wndb (5WN),
.BR wninput (5WN).


@ -1,110 +0,0 @@
'\" t
.\" $Id$
.TH MORPH 3WN "Dec 2006" "WordNet 3.0" "WordNet\(tm Library Functions"
.SH NAME
morphinit, re_morphinit, morphstr, morphword
.SH SYNOPSIS
.LP
\fB#include "wn.h"\fP
.LP
\fBint morphinit(void);\fP
.LP
\fBint re_morphinit(void);\fP
.LP
\fBchar *morphstr(char *origstr, int pos);\fP
.LP
\fBchar *morphword(char *word, int pos);\fP
.SH DESCRIPTION
.LP
The WordNet morphological processor, Morphy, is accessed through these
functions:
.LP
.B morphinit(\|)
is used to open the exception list files. It returns \fB0\fP if
successful, \fB-1\fP otherwise. The exception list files must be
opened before
.B morphstr(\|)
or
.B morphword(\|)
is called.
.LP
.B re_morphinit(\|)
is used to close the exception list files and reopen them, and is used
exclusively for WordNet development. Return codes are as described
above.
.LP
.B morphstr(\|)
is the basic user interface to Morphy. It tries to find the base form
(lemma) of the word or collocation \fIorigstr\fP in the specified
\fIpos\fP. The first call (with \fIorigstr\fP specified) returns a
pointer to the first base form found. Subsequent calls requesting
base forms of the same string must be made with the first argument of
.SB NULL.
When no more base forms for \fIorigstr\fP can be found,
.SB NULL
is returned. Note that \fBmorphstr()\fP returns a pointer to a static
character buffer. A subsequent call to \fBmorphstr()\fP with a new
string (instead of \fBNULL\fP) will overwrite the string pointed to by
a previous call. Users should copy the returned string into a local
buffer, or use the C library function \fBstrdup\fP to duplicate the
returned string into a \fImalloc'd\fP buffer.
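The calling convention described above can be sketched outside of C. The following Python fragment is only a toy stand-in, not the WordNet library: the class \fBToyMorphy\fP, the helper \fBall_base_forms\fP and the sample table are hypothetical names introduced for illustration. It shows the loop an application would write: one call with the string, then repeated calls with a null argument, copying each result before the next call.

```python
# Toy stand-in for the C morphstr() calling convention; NOT the WordNet library.
# The table below is hypothetical sample data; 1 is the NOUN part of speech.
TOY_BASE_FORMS = {("axes", 1): ["axe", "axis"]}


class ToyMorphy:
    def __init__(self):
        self._pending = []  # base forms not yet handed out

    def morphstr(self, origstr, pos):
        # A non-None origstr starts a new lookup and discards earlier state,
        # much as the C function reuses a single static buffer.
        if origstr is not None:
            self._pending = list(TOY_BASE_FORMS.get((origstr, pos), []))
        return self._pending.pop(0) if self._pending else None


def all_base_forms(morphy, word, pos):
    """Collect every base form, copying each result as it is returned."""
    forms = []
    form = morphy.morphstr(word, pos)
    while form is not None:
        forms.append(str(form))  # the copy step the text above recommends
        form = morphy.morphstr(None, pos)
    return forms
```

In C the copy step matters because the returned pointer refers to a static buffer; in this sketch it is kept only to mirror that advice.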
.LP
.B morphword(\|)
tries to find the base form of \fIword\fP in the specified \fIpos\fP.
This function is called by
.B morphstr(\|)
for each individual word in a collocation.
Note that \fBmorphword()\fP returns a pointer to a static
character buffer. A subsequent call to \fBmorphword()\fP
will overwrite the string pointed to by
a previous call. Users should copy the returned string into a local
buffer, or use the C library function \fBstrdup\fP to duplicate the
returned string into a \fImalloc'd\fP buffer.
.SH NOTES
.B morphinit(\|)
is called by
.B wninit(\|)
and is not intended to be called directly by an application.
Applications wishing to use WordNet and/or the morphological functions
must call \fBwninit(\|)\fP at the start of the program. See
.BR wnutil (3WN)
for more information.
\fIorigstr\fP may be either a word or a collocation formed by joining
individual words with underscore characters (\fB_\fP).
Usually only \fBmorphstr(\|)\fP is called from applications, as it
works on both words and collocations.
\fIpos\fP must be one of the following:
.RS
.nf
\fB1\fP NOUN
\fB2\fP VERB
\fB3\fP ADJECTIVE
\fB4\fP ADVERB
\fB5\fP ADJECTIVE_SATELLITE
.fi
.RE
If
.SB ADJECTIVE_SATELLITE
is passed, it is treated by \fBmorphstr(\|)\fP as
.SB ADJECTIVE.
.SH SEE ALSO
.BR wnintro (3WN),
.BR wnsearch (3WN),
.BR wndb (5WN),
.BR morphy (7WN).
.SH WARNINGS
Passing an invalid part of speech will result in a core dump.
The WordNet database files must be open to use \fBmorphstr(\|)\fP or
\fBmorphword(\|)\fP.
.SH BUGS
Morphy will allow non-words to be converted to words, if they follow
one of the rules of detachment described in
.BR morphy (7WN).
For example, it will happily convert \fBplantes\fP to \fBplants\fP.


@ -1,180 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH MORPHY 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
morphy \- discussion of WordNet's morphological processing
.SH DESCRIPTION
Although only base forms of words are usually stored in WordNet,
searches may be done on inflected forms. A set of morphology
functions, Morphy, is applied to the search string to generate a form
that is present in WordNet.
Morphology in WordNet uses two types of processes to try to convert
the string passed into one that can be found in the WordNet database.
There are lists of inflectional endings, based on syntactic category,
that can be detached from individual words in an attempt to find a
form of the word that is in WordNet. There are also exception list
files, one for each syntactic category, in which a search for an
inflected form is done. Morphy tries to use these two processes in an
intelligent manner to translate the string passed to the base form
found in WordNet. Morphy first checks for exceptions, then uses the
rules of detachment. The Morphy functions are not independent from
WordNet. After each transformation, WordNet is searched for the
resulting string in the syntactic category specified.
The Morphy functions are passed a string and a syntactic category. A
string is either a single word or a collocation. Since some words,
such as \fBaxes\fP can have more than one base form (\fBaxe\fP and
\fBaxis\fP), Morphy works in the following manner. The first time
that Morphy is called with a specific string, it returns a base form.
For each subsequent call to Morphy made with a
.SB NULL
string argument, Morphy returns another base form. Whenever Morphy
cannot perform a transformation, whether on the first call for a word
or subsequent calls,
.SB NULL
is returned. A transformation to a valid English string will return
.SB NULL
if the base form of the string is not in WordNet.
The morphological functions are found in the WordNet library. See
.BR morph (3WN)
for information on using these functions.
.SS Rules of Detachment
The following table shows the rules of detachment used by Morphy. If
a word ends with one of the suffixes, it is stripped from the word and
the corresponding ending is added. Then WordNet is searched for the
resulting string. No rules are applicable to adverbs.
.TS
center, tab(+) ;
c | c | c
l | l | l.
\fBPOS\fP+\fBSuffix\fP+\fBEnding\fP
_
NOUN+"s"+""
NOUN+"ses"+"s"
NOUN+"xes"+"x"
NOUN+"zes"+"z"
NOUN+"ches"+"ch"
NOUN+"shes"+"sh"
NOUN+"men"+"man"
NOUN+"ies"+"y"
VERB+"s"+""
VERB+"ies"+"y"
VERB+"es"+"e"
VERB+"es"+""
VERB+"ed"+"e"
VERB+"ed"+""
VERB+"ing"+"e"
VERB+"ing"+""
ADJ+"er"+""
ADJ+"est"+""
ADJ+"er"+"e"
ADJ+"est"+"e"
.TE
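As a rough illustration of how the table can be applied, the Python sketch below tries every matching rule for a part of speech and keeps the candidates that occur in a lexicon. The function name \fBdetach\fP and the \fBlexicon\fP set are assumptions made for this example; the set merely stands in for the WordNet database search, and the real Morphy code may order or filter candidates differently.

```python
# (suffix, ending) pairs copied from the table above, keyed by part of speech.
RULES = {
    "NOUN": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
             ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "VERB": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
             ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
    "ADJ": [("er", ""), ("est", ""), ("er", "e"), ("est", "e")],
}


def detach(word, pos, lexicon):
    """Return candidate base forms of word that appear in lexicon.

    lexicon is a hypothetical set of known base forms standing in for the
    WordNet search performed after each transformation.
    """
    found = []
    for suffix, ending in RULES.get(pos, []):  # no rules for adverbs
        if word.endswith(suffix) and len(word) > len(suffix):
            candidate = word[:-len(suffix)] + ending
            if candidate in lexicon and candidate not in found:
                found.append(candidate)
    return found
```

For instance, with a lexicon containing only "box", the noun "boxes" fails the "s" rule (giving "boxe") and succeeds on the "xes" rule.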
.SS Exception Lists
There is one exception list file for each syntactic category. The
exception lists contain the morphological transformations for strings
that are not regular and therefore cannot be processed in an
algorithmic manner. Each line of an exception list contains an
inflected form of a word or collocation, followed by one or more base
forms. The list is kept in alphabetical order and a binary search is
used to find words in these lists. See
.BR wndb (5WN)
for information on the format of the exception list files.
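A lookup in such a list might be sketched as follows in Python. The sketch assumes that the fields on each line are separated by whitespace and that the lines are already sorted by inflected form; the function name and the sample entries are hypothetical, and a real implementation would search the file in place rather than build a key list first.

```python
from bisect import bisect_left


def lookup_exception(lines, inflected):
    """Binary-search sorted exception lines; return the base forms or []."""
    # Assumed layout: inflected form first, then one or more base forms.
    keys = [line.split()[0] for line in lines]
    i = bisect_left(keys, inflected)
    if i < len(keys) and keys[i] == inflected:
        return lines[i].split()[1:]
    return []


# Hypothetical sample entries in the assumed format, in alphabetical order.
SAMPLE = ["axes axe axis", "geese goose", "mice mouse"]
```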
.SS Single Words
In general, single words are relatively easy to process. Morphy first
looks for the word in the exception list. If it is found, the first
base form is returned. Subsequent calls with a
.SB NULL
argument return additional base forms, if present. A
.SB NULL
is returned when there are no more base forms of the word.
If the word is not found in the exception list corresponding to the
syntactic category, an algorithmic process using the rules of
detachment looks for a matching suffix. If a matching suffix is
found, a corresponding ending is applied (sometimes this ending is a
.SB NULL
string, so in effect the suffix is removed from the word), and WordNet
is consulted to see if the resulting word is found in the desired part
of speech.
.SS Collocations
As opposed to single words, collocations can be quite difficult to
transform into a base form that is present in WordNet. In general,
only base forms of words, even those comprising collocations, are
stored in WordNet, such as \fBattorney~general\fP. Transforming the
collocation \fBattorneys~general\fP is then simply a matter of finding
the base forms of the individual words comprising the collocation.
This usually works for nouns; non-conforming nouns, such as
\fBcustoms~duty\fP, are therefore entered in the noun exception list.
Verb collocations that contain prepositions, such as \fBask~for~it\fP,
are more difficult. As with single words, the exception list is
searched first. If the collocation is not found, special code in
Morphy determines whether a verb collocation includes a preposition.
If it does, a function is called to try to find the base form in the
following manner. It is assumed that the first word in the
collocation is a verb and that the last word is a noun. The algorithm
then builds a search string with the base forms of the verb and noun,
leaving the remainder of the collocation (usually just the
preposition, but more words may be involved) in the middle. For
example, passed \fBasking~for~it\fP, the database search would be
performed with \fBask~for~it\fP, which is found in WordNet, and
therefore returned from Morphy. If a verb collocation does not
contain a preposition, then the base form of each word in the
collocation is found and WordNet is searched for the resulting string.
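The handling of verb collocations that contain a preposition can be pictured with the short Python sketch below. It assumes the words of the collocation are joined by underscores, as described in morph(3WN), and it takes a caller-supplied \fBbase_form\fP function; both that function and the toy table are hypothetical stand-ins for the single-word processing and the database search.

```python
def verb_collocation_base(collocation, base_form):
    """Lemmatize the first word as a verb and the last as a noun.

    base_form(word, pos) is a hypothetical callable that returns the base
    form of a single word, or the word unchanged if none is found.
    """
    words = collocation.split("_")
    if len(words) < 2:
        return base_form(collocation, "VERB")
    first = base_form(words[0], "VERB")
    last = base_form(words[-1], "NOUN")
    # The middle (usually just the preposition) is left untouched.
    return "_".join([first] + words[1:-1] + [last])


# Toy single-word lookup used only for this illustration.
TOY = {("asking", "VERB"): "ask"}


def toy_base_form(word, pos):
    return TOY.get((word, pos), word)
```

The resulting string would then be looked up in WordNet; that step is omitted here.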
.SS Hyphenation
Hyphenation also presents special difficulties when searching WordNet.
It is often a subjective decision as to whether a word is hyphenated,
joined as one word, or is a collocation of several words, and which of
the various forms are entered into WordNet. When Morphy breaks a
string into "words", it looks for both spaces and hyphens as
delimiters. It also looks for periods in strings and removes them if
an exact match is not found. A search for an abbreviation like
\fBoct.\fP returns the synset for \fB{~October,~Oct~}\fP. Not every
pattern of hyphenated and collocated string is searched for properly,
so it may be advantageous to specify several search strings if the
results of a search attempt seem incomplete.
.SS Special Processing for nouns ending with 'ful'
Morphy contains code that searches for nouns ending with \fBful\fP
and performs a transformation on the substring preceding it. It then
appends 'ful' back onto the resulting string and returns it. For
example, if passed the noun \fBboxesful\fP, it will return \fBboxful\fP.
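A minimal Python sketch of this special case is given below. The function names and the toy lookup table are assumptions made for illustration; the table stands in for the ordinary noun processing described earlier.

```python
def base_of_ful_noun(word, base_form):
    """Strip a trailing 'ful', transform the rest, then put 'ful' back.

    base_form(word) is a hypothetical callable returning the base form of
    a noun, or the word unchanged if no base form is found.
    """
    if word.endswith("ful") and len(word) > len("ful"):
        return base_form(word[:-len("ful")]) + "ful"
    return base_form(word)


# Toy noun lookup used only for this illustration.
TOY_NOUNS = {"boxes": "box"}


def toy_noun_base(word):
    return TOY_NOUNS.get(word, word)
```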
.SH BUGS
Since many noun collocations contain prepositions, such as
\fBline~of~products\fP, an algorithm similar to that used for verbs
should be written for nouns. In the present scheme, if Morphy is
passed \fBlines~of~products\fP, the search string becomes
\fBline~of~product\fP, which is not in WordNet.
Morphy will allow non-words to be converted to words, if they follow
one of the rules described above. For example, it will happily
convert \fBplantes\fP to \fBplants\fP.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B \fIpos\fP.exc
morphology exception lists
.SH SEE ALSO
.BR wn (1WN),
.BR wnb (1WN),
.BR binsrch (3WN),
.BR morph (3WN),
.BR wndb (5WN),
.BR wninput (5WN).


@ -1,344 +0,0 @@
.\" $Id$
.tr ~
.TH PROLOGDB 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
wn_\*.pl \- description of Prolog database files
.SH DESCRIPTION
The files \fBwn_\fP\fI*\fP\fB.pl\fP contain the WordNet database in a
prolog-readable format. A prolog interface to WordNet is not
implemented.
The prolog database is very large and may take many minutes to load
into the Prolog workspace. A separate file has been created for each
WordNet relation giving the user the ability to load only those parts
of the database that they are interested in.
See \fBFILES\fP, below, for a list of the database files and
.BR wndb (5WN)
and
.BR wninput (5WN)
for detailed descriptions of the various WordNet relations (referred to
as \fIoperators\fP in this manual page).
.SS File Format
Each prolog database file contains information corresponding to the
synsets and word senses contained in the WordNet database. In the
prolog version of the database, the \fIsynset_id\fPs (defined below)
are used as unique synset identifiers.
Each line of a file contains an operator that corresponds to a WordNet
relation. All lines with the same \fIoperator\fP value are stored in
the file \fBwn_\fP\fIoperator\fP\fB.pl\fP.
The general format of a line in a prolog database file is as follows:
.RS
.nf
\fIoperator\fB(\fIfield1\fB,\fI~~...~~\fB,\fIfieldn\fB).\fR
.fi
.RE
Each line contains the name of the operator, followed by a left
parenthesis, a comma-separated list of fields, a right parenthesis,
and a period. Note there are no spaces, and each line is terminated
with a newline character.
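Since no Prolog interface is provided, other languages can read these files by parsing each line directly. The Python sketch below splits a line of the general form shown above into its operator and fields. It also honours the quoting rule given in the NOTES section of this page (a single quote inside a quoted field is written as two adjacent quotes). The function name is hypothetical and the sample values in the usage note are invented, not taken from the database.

```python
def parse_fact(line):
    """Split a line of the form operator(f1,...,fn). into (operator, fields)."""
    line = line.strip()
    if not line.endswith(")."):
        raise ValueError("not a database line: " + line)
    operator, _, body = line[:-2].partition("(")
    fields, current, quoted, i = [], [], False, 0
    while i < len(body):
        ch = body[i]
        if quoted:
            if ch == "'" and body[i + 1:i + 2] == "'":
                current.append("'")  # doubled quote is one literal quote
                i += 1
            elif ch == "'":
                quoted = False       # closing quote of the field
            else:
                current.append(ch)   # commas inside quotes are kept
        elif ch == "'":
            quoted = True
        elif ch == ",":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return operator, fields
```

Called on an invented line such as hyp(100000001,100000002). it yields the operator name and the two identifiers as strings; commas inside a quoted gloss do not split the field.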
.SS Operators
Each WordNet relation is represented in a separate file by
\fIoperator\fP name. Some operators are reflexive (i.e. the "reverse"
relation is implicit). So, for example, if \fBx\fP is a hypernym of
\fBy\fP, \fBy\fP is necessarily a hyponym of \fBx\fP. In the prolog
database, reflected pointers are usually implied for semantic
relations.
Semantic relations are represented by a pair of \fIsynset_id\fPs, in
which the first \fIsynset_id\fP is generally the source of the
relation and the second is the target. If two pairs
\fIsynset_id\fP\fB,\fP\fIw_num\fP are present, the operator represents
a lexical relation between word forms.
.nf
\fBs(\fIsynset_id\fB,\fIw_num\fB,'\fIword\fB',\fIss_type\fB,\fIsense_number\fB,\fItag_count\fB).
.fi
.RS
An \fBs\fP operator is present for every word sense in WordNet. In
\fBwn_s.pl\fP, \fIw_num\fP specifies the word number for \fIword\fP in
the synset.
.RE
.nf
\fBg(\fIsynset_id\fB,'(\fIgloss\fB)').
.fi
.RS
The \fBg\fP operator specifies the gloss for a synset.
.RE
.nf
\fBhyp(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBhyp\fP operator specifies that the second synset is a
hypernym of the first synset. This relation holds for nouns and
verbs. The reflexive operator, hyponym, implies that the first
synset is a hyponym of the second synset.
.RE
.nf
\fBent(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBent\fP operator specifies that the second synset is
an entailment of the first synset. This relation only holds for verbs.
.RE
.nf
\fBsim(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBsim\fP operator specifies that the second synset is similar in
meaning to the first synset. This means that the second synset is a
satellite of the first synset, which is the cluster head. This relation
only holds for adjective synsets contained in adjective clusters.
.RE
.nf
\fBmm(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBmm\fP operator specifies that the second synset is a
member meronym of the first synset. This relation only holds for
nouns. The reflexive operator, member holonym, can be implied.
.RE
.nf
\fBms(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBms\fP operator specifies that the second synset is a
substance meronym of the first synset. This relation only holds for
nouns. The reflexive operator, substance holonym, can be implied.
.RE
.nf
\fBmp(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBmp\fP operator specifies that the second synset is a
part meronym of the first synset. This relation only holds for
nouns. The reflexive operator, part holonym, can be implied.
.RE
.nf
\fBcs(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBcs\fP operator specifies that the second synset is a cause
of the first synset. This relation only holds for verbs.
.RE
.nf
\fBvgp(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBvgp\fP operator specifies verb synsets that are similar in
meaning and should be grouped together when displayed in response to a
grouped synset search.
.RE
.nf
\fBat(\fIsynset_id\fB,\fIsynset_id\fB).
.fi
.RS
The \fBat\fP operator defines the attribute relation between noun and
adjective synset pairs in which the adjective is a value of the noun.
For each pair, both relations are listed (i.e. each \fIsynset_id\fP is
both a source and target).
.RE
.nf
\fBant(\fIsynset_id\fB,\fIw_num\fB,\fIsynset_id\fB,\fIw_num\fB).
.fi
.RS
The \fBant\fP operator specifies antonymous \fIword\fPs. This is a
lexical relation that holds for all syntactic categories. For each
antonymous pair, both relations are listed (i.e. each
\fIsynset_id,w_num\fP pair is both a source and target word).
.RE
.nf
\fBsa(\fIsynset_id\fB,\fIw_num\fB,\fIsynset_id\fB,\fIw_num\fB).
.fi
.RS
The \fBsa\fP operator specifies that additional information about the
first word can be obtained by seeing the second word. This
operator is only defined for verbs and adjectives. There is no reflexive
relation (i.e. it cannot be inferred that the additional information
about the second word can be obtained from the first word).
.RE
.nf
\fBppl(\fIsynset_id\fB,\fIw_num\fB,\fIsynset_id\fB,\fIw_num\fB).
.fi
.RS
The \fBppl\fP operator specifies that the adjective first word is a
participle of the verb second word. The reflexive operator can be
implied.
.RE
.nf
\fBper(\fIsynset_id\fB,\fIw_num\fB,\fIsynset_id\fB,\fIw_num\fB).
.fi
.RS
The \fBper\fP operator specifies two different relations based on the
parts of speech involved. If the first word is in an adjective
synset, that word pertains to either the noun or adjective second
word. If the first word is in an adverb synset, that word is derived
from the adjective second word.
.RE
.nf
\fBfr(\fIsynset_id\fB,\fIf_num\fB,\fIw_num\fB).
.fi
.RS
The \fBfr\fP operator specifies a generic sentence frame for one or
all words in a synset. The operator is defined only for verbs.
.RE
.SS Field Definitions
A \fIsynset_id\fP is a nine byte field in which the first
byte defines the syntactic category of the synset and the remaining
eight bytes are a \fIsynset_offset\fP, as defined in
.BR wndb (5WN),
indicating the byte offset in the \fBdata.\fP\fIpos\fP file that
corresponds to the syntactic category.
The syntactic category is encoded as:
.RS
.nf
\fB1\fP NOUN
\fB2\fP VERB
\fB3\fP ADJECTIVE
\fB4\fP ADVERB
.fi
.RE
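Decoding a \fIsynset_id\fP therefore amounts to splitting off the first digit. A small Python sketch, with a hypothetical function name and an invented identifier in the usage note:

```python
POS_BY_DIGIT = {"1": "NOUN", "2": "VERB", "3": "ADJECTIVE", "4": "ADVERB"}


def split_synset_id(synset_id):
    """Return (syntactic category, synset_offset) for a nine digit id."""
    if len(synset_id) != 9 or not synset_id.isdigit():
        raise ValueError("expected nine digits: " + synset_id)
    if synset_id[0] not in POS_BY_DIGIT:
        raise ValueError("unknown syntactic category: " + synset_id[0])
    # The remaining eight digits are the zero-filled byte offset.
    return POS_BY_DIGIT[synset_id[0]], int(synset_id[1:])
```

For the invented value 100001740 this gives the category NOUN and the offset 1740.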
\fIw_num\fP, if present, indicates which word in the synset is being
referred to. Word numbers are assigned to the \fIword\fP fields in a
synset, from left to right, beginning with 1. When used to represent
lexical WordNet relations \fIw_num\fP may be 0, indicating that the
relation holds for all words in the synset indicated by the preceding
\fIsynset_id\fP. See
.BR wninput (5WN)
for a discussion of semantic and lexical relations.
\fIss_type\fP is a one character code indicating the synset type:
.RS
.nf
\fBn\fP NOUN
\fBv\fP VERB
\fBa\fP ADJECTIVE
\fBs\fP ADJECTIVE~SATELLITE
\fBr\fP ADVERB
.fi
.RE
\fIsense_number\fP specifies the sense number of the word, within the
part of speech encoded in the \fIsynset_id\fP, in the WordNet
database.
\fIword\fP is the ASCII text of the word as entered in the synset by
the lexicographer, with spaces replaced by underscore characters
(\fB_\fP). The text of the word is case sensitive. An adjective
\fIword\fP is immediately followed by a syntactic marker if one was
specified in the lexicographer file. A syntactic marker is appended,
in parentheses, onto \fIword\fP without any intervening spaces. See
.BR wninput (5WN)
for a list of the syntactic markers for adjectives.
Each synset has a \fIgloss\fP that may contain a definition, one or
more example sentences, or both. Note that glosses are enclosed in
single forward quotes and parentheses:~~\fB'(\fIgloss\fB)'\fR.
\fIf_num\fP specifies the generic sentence frame number for word
\fIw_num\fP in the synset indicated by \fIsynset_id\fP. Note that
when \fIw_num\fP is \fB0\fP, the frame number applies to all words in
the synset. If non-zero, the frame applies to that word in the
synset.
In WordNet, sense numbers are assigned as described in
.BR wndb (5WN).
\fItag_count\fP is the number of times the sense was tagged in the
Semantic Concordances, and \fB0\fP if it was not instantiated.
.SH NOTES
Since single forward quotes are used to enclose character strings,
single quote characters found in \fIword\fP and \fIgloss\fP fields are
represented as two adjacent single quote characters.
The load time can be greatly reduced by creating "object language"
versions of the files, an option that is supported by some
implementations, such as Quintus Prolog.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
All files are in \fBWNHOME/prolog\fP on Unix platforms and
\fBWNHome\eprolog\fP on Windows platforms.
.TP 20
.B wn_s.pl
synset pointers
.TP 20
.B wn_g.pl
gloss pointers
.TP 20
.B wn_hyp.pl
hypernym pointers
.TP 20
.B wn_ent.pl
entailment pointers
.TP 20
.B wn_sim.pl
similar pointers
.TP 20
.B wn_mm.pl
member meronym pointers
.TP 20
.B wn_ms.pl
substance meronym pointers
.TP 20
.B wn_mp.pl
part meronym pointers
.TP 20
.B wn_cs.pl
cause pointers
.TP 20
.B wn_vgp.pl
grouped verb pointers
.TP 20
.B wn_at.pl
attribute pointers
.TP 20
.B wn_ant.pl
antonym pointers
.TP 20
.B wn_sa.pl
see also pointers
.TP 20
.B wn_ppl.pl
participle pointers
.TP 20
.B wn_per.pl
pertainym pointers
.TP 20
.B wn_fr.pl
frame pointers
.SH SEE ALSO
.BR wndb (5WN),
.BR wninput (5WN),
.BR wngroups (7WN),
.BR wnpkgs (7WN).


@ -1,164 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH SENSEIDX 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
index.sense, sense.idx \- WordNet's sense index
.SH DESCRIPTION
The WordNet sense index provides an alternate method for accessing
synsets and word senses in the WordNet database. It is useful to
applications that retrieve synsets or other information related to a
specific sense in WordNet, rather than all the senses of a word or
collocation. It can also be used with tools like \fBgrep\fP and Perl
to find all senses of a word in one or more parts of speech. A
specific WordNet sense, encoded as a \fIsense_key\fP, can be used as
an index into this file to obtain its WordNet sense number, the
database byte offset of the synset containing the sense, and the
number of times it has been tagged in the semantic concordance texts.
Concatenating the \fIlemma\fP and \fIlex_sense\fP fields of a
semantically tagged word (represented in a \fB<wf~\fP...~\fB>\fP
attribute/value pair) in a semantic concordance file, using \fB%\fP as
the concatenation character, creates the \fIsense_key\fP for that
sense, which can in turn be used to search the sense index file.
A \fIsense_key\fP is the best way to represent a sense in semantic
tagging or other systems that refer to WordNet senses.
\fIsense_key\fPs are independent of WordNet sense numbers and
\fIsynset_offset\fPs, which vary between versions of the database.
Using the sense index and a \fIsense_key\fP, the corresponding synset
(via the \fIsynset_offset\fP) and WordNet sense number can easily be
obtained. A mapping from noun \fIsense_key\fPs in WordNet 1.6 to
corresponding 2.0 \fIsense_key\fPs is provided with version 2.0,
and is described in
.BR sensemap (5WN).
See
.BR wndb (5WN)
for a thorough discussion of the WordNet database files.
.SS File Format
The sense index file lists all of the senses in the WordNet database
with each line representing one sense. The file is in alphabetical
order, fields are separated by one space, and each line is terminated
with a newline character.
Each line is of the form:
.RS
\fIsense_key~~synset_offset~~sense_number~~tag_cnt\fP
.RE
\fIsense_key\fP is an encoding of the word sense. Programs can
construct a sense key in this format and use it as a binary search key
into the sense index file.
The format of a \fIsense_key\fP is
described below.
\fIsynset_offset\fP is the byte offset at which the synset containing
the sense is found in the database "data" file corresponding to the
part of speech encoded in the \fIsense_key\fP. \fIsynset_offset\fP is
an 8 digit, zero-filled decimal integer, and can be used with
.BR fseek (3)
to read a synset from the data file. When passed to the WordNet library
function \fBread_synset(\|)\fP along with the syntactic category, a data
structure containing the parsed synset is returned.
\fIsense_number\fP is a decimal integer indicating the sense number of
the word, within the part of speech encoded in \fIsense_key\fP, in the
WordNet database. See
.BR wndb (5WN)
for information about how sense numbers are assigned.
\fItag_cnt\fP is a decimal integer giving the number of times the
sense is tagged in various semantic concordance texts. A \fItag_cnt\fP of
\fB0\fP indicates that the sense has not been semantically tagged.
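The four fields of a line can be read with a few lines of Python, and the offset can then be used to seek into a data file, in the same spirit as the \fBfseek\fP usage mentioned above. This is only a sketch: the names \fBSenseEntry\fP, \fBparse_index_line\fP and \fBread_line_at\fP are hypothetical, and reading a single line at the offset assumes that each synset occupies one line of the data file.

```python
from collections import namedtuple

SenseEntry = namedtuple(
    "SenseEntry", "sense_key synset_offset sense_number tag_cnt")


def parse_index_line(line):
    """Split one sense index line into its four space-separated fields."""
    sense_key, offset, sense_number, tag_cnt = line.split()
    return SenseEntry(sense_key, int(offset), int(sense_number), int(tag_cnt))


def read_line_at(data_file, synset_offset):
    """Seek to synset_offset in a binary file object and read one line."""
    data_file.seek(synset_offset)
    return data_file.readline().decode("ascii")
```

The data file should be opened in binary mode so that the byte offset is interpreted exactly as stored.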
.SS Sense Key Encoding
A \fIsense_key\fP is represented as:
.RS
\fIlemma\fP\fB%\fP\fIlex_sense\fP
.RE
where \fIlex_sense\fP is encoded as:
.RS
\fIss_type\fB:\fIlex_filenum\fB:\fIlex_id\fB:\fIhead_word\fB:\fIhead_id\fR
.RE
\fIlemma\fP is the ASCII text of the word or collocation as found in
the WordNet database index file corresponding to \fIpos\fP.
\fIlemma\fP is in lower case, and collocations are formed by joining
individual words with an underscore (\fB_\fP) character.
\fIss_type\fP is a one digit decimal integer representing the synset type
for the sense. See
.SB "Synset Type"
below for a listing of the numbers corresponding to each synset type.
\fIlex_filenum\fP is a two digit decimal integer representing the
name of the lexicographer file containing the synset for the sense.
See
.BR lexnames (5WN)
for the list of lexicographer file names and their corresponding numbers.
\fIlex_id\fP is a two digit decimal integer that, when appended onto
\fIlemma\fP, uniquely identifies a sense within a lexicographer file.
\fIlex_id\fP numbers usually start with \fB00\fP, and are incremented
as additional senses of the word are added to the same file, although
there is no requirement that the numbers be consecutive or begin with
\fB00\fP. Note that a value of \fB00\fP is the default, and therefore
is not present in lexicographer files. Only non-default \fIlex_id\fP
values must be explicitly assigned in lexicographer files. See
.BR wninput (5WN)
for information on the format of lexicographer files.
\fIhead_word\fP is only present if the sense is in an adjective
satellite synset. It is the lemma of the first word of the
satellite's head synset.
\fIhead_id\fP is a two digit decimal integer that, when appended onto
\fIhead_word\fP, uniquely identifies the sense of \fIhead_word\fP
within a lexicographer file, as described for \fIlex_id\fP. There is
a value in this field only if \fIhead_word\fP is present.
.SS Synset Type
The synset type is encoded as follows:
.RS
.nf
\fB1\fP NOUN
\fB2\fP VERB
\fB3\fP ADJECTIVE
\fB4\fP ADVERB
\fB5\fP ADJECTIVE SATELLITE
.fi
.RE
.SH NOTES
For non-satellite senses the \fIhead_word\fP and \fIhead_id\fP fields
have no values; however, the field separator character (\fB:\fP) is
still present.
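Putting the encoding together, a \fIsense_key\fP can be taken apart as in the Python sketch below. The function name is hypothetical, and the keys shown in the usage note are invented for illustration. Empty \fIhead_word\fP and \fIhead_id\fP fields are reported as None.

```python
SS_TYPES = {1: "NOUN", 2: "VERB", 3: "ADJECTIVE", 4: "ADVERB",
            5: "ADJECTIVE SATELLITE"}


def parse_sense_key(sense_key):
    """Decode lemma%ss_type:lex_filenum:lex_id:head_word:head_id."""
    lemma, sep, lex_sense = sense_key.rpartition("%")
    if not sep:
        raise ValueError("missing % separator: " + sense_key)
    parts = lex_sense.split(":")
    if len(parts) != 5:
        raise ValueError("expected five lex_sense fields: " + sense_key)
    ss_type, lex_filenum, lex_id, head_word, head_id = parts
    return {
        "lemma": lemma,
        "ss_type": SS_TYPES[int(ss_type)],
        "lex_filenum": int(lex_filenum),
        "lex_id": int(lex_id),
        # Present only for adjective satellites; otherwise the fields are empty.
        "head_word": head_word or None,
        "head_id": int(head_id) if head_id else None,
    }
```

An invented noun key such as dog%1:05:00:: decodes with both head fields set to None, while an invented satellite key such as toyish%5:00:01:playful:02 carries a head word and head id.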
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B index.sense
sense index
.SH SEE ALSO
.BR binsrch (3WN),
.BR wnsearch (3WN),
.BR lexnames (5WN),
.BR wnintro (5WN),
.BR sensemap (5WN),
.BR wndb (5WN),
.BR wninput (5WN).


@ -1,28 +0,0 @@
'\" t
.\" $Id$
.TH UNIQBEG 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
uniqbeg \- unique beginners for noun hierarchies
.SH DESCRIPTION
All of the WordNet noun synsets are organized into hierarchies, headed
by the unique beginner synset for \fBentity\fP in the file
\fBnoun.Tops\fP.
.RS
.nf
{ entity (that which is perceived or known or inferred to have its own
distinct existence (living or nonliving)) }
.fi
.RE
.SH NOTES
The lexicographer files are not included in the WordNet database package.
.SH FILES
.TP 20
.B noun.Tops
unique beginners for nouns
.SH SEE ALSO
.BR wndb (5WN),
.BR wninput (5WN),
.BR wnintro (7WN),
.BR wngloss (7WN).


@ -1,340 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WN 1WN "Dec 2006" "WordNet 3.0" "WordNet\(tm User Commands"
.SH NAME
wn \- command line interface to WordNet lexical database
.SH SYNOPSIS
\fBwn\fP [ \fIsearchstr\fP ] [ \fB\-h\fP ] [ \fB\-g\fP ] [ \fB\-a\fP ] [ \fB\-l\fP ] [ \fB\-o\fP ] [ \fB\-s\fP ] [ \fB\-n\fI#\fR ] [ \fIsearch_option\fP... ]
.SH DESCRIPTION
\fBwn(\|)\fP provides a command line interface to the WordNet
database, allowing synsets and relations to be displayed as formatted
text. For each word, different searches are provided, based on
syntactic category and pointer types. Although only base forms of
words are usually stored in WordNet, users may search for inflected
forms. A morphological process is applied to the search string to
generate a form that is present in WordNet.
The command line interface is often useful when writing scripts to
extract information from the WordNet database. Post-processing of the
output with various scripting tools can reformat the results as
desired.
.SH OPTIONS
.TP 15
.B \-h
Print help text before search results.
.TP 15
.B \-g
Display textual glosses associated with synsets.
.TP 15
.B \-a
Display lexicographer file information.
.TP 15
.B \-o
Display synset offset of each synset.
.TP 15
.B \-s
Display each word's sense numbers in synsets.
.TP 15
.B \-l
Display the WordNet copyright notice, version number, and license.
.TP 15
.B \-n\fI#\fP
Perform search on sense number \fI#\fP only.
.TP 15
\fB\-over\fP
Display overview of all senses of \fIsearchstr\fP in all syntactic
categories.
.SS Search Options
Note that the last letter of \fIsearch_option\fP generally denotes the
part of speech that the search applies to: \fBn\fP for nouns, \fBv\fP
for verbs, \fBa\fP for adjectives, and \fBr\fP for adverbs. Multiple
searches may be done for \fIsearchstr\fP with a single command by
specifying all the appropriate search options.
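When \fBwn\fP is driven from a script, the naming convention above makes it easy to assemble the argument list. The Python sketch below only builds the list and does not run the command; the helper name \fBwn_args\fP and the mapping are assumptions made for this example, and the option stems used in the usage note are ones documented on this page.

```python
POS_LETTERS = {"noun": "n", "verb": "v", "adjective": "a", "adverb": "r"}


def wn_args(searchstr, searches):
    """Build a wn argument list.

    searches is an iterable of (option_stem, part_of_speech) pairs such as
    ("hype", "noun"); the final letter of each option encodes the part of
    speech, following the convention described above.
    """
    args = ["wn", searchstr]
    for stem, pos in searches:
        args.append("-" + stem + POS_LETTERS[pos])
    return args
```

For example, requesting the stems hype for nouns and syns for verbs on the search string dog produces the options -hypen and -synsv. The resulting list could be handed to a process-spawning routine, and the output post-processed as desired.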
.TP 15
\fB\-syns\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
Display synonyms
and immediate hypernyms of synsets containing \fIsearchstr\fP.
Synsets are ordered by estimated frequency of use. For adjectives, if
\fIsearchstr\fP is in a head synset, the cluster's satellite synsets
are displayed in place of hypernyms. If \fIsearchstr\fP is in a
satellite synset, its head synset is also displayed.
.TP 15
\fB\-simsv\fP
Display verb synonyms and
immediate hypernyms of synsets containing \fIsearchstr\fP. Synsets
are grouped by similarity of meaning.
.TP 15
\fB\-ants\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
Display synsets containing antonyms of \fIsearchstr\fP.
For adjectives, if \fIsearchstr\fP is
in a head synset, \fIsearchstr\fP has a direct antonym.
The head synset for the direct antonym is displayed along
with the direct antonym's satellite synsets. If \fIsearchstr\fP is in a
satellite synset, \fIsearchstr\fP has an indirect antonym via the
head synset, which is displayed.
.TP 15
\fB\-faml\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
Display familiarity and polysemy information for \fIsearchstr\fP.
.TP 15
\fB\-hype\fP(\fIn\fP | \fIv\fP)
Recursively display hypernym (superordinate) tree for \fIsearchstr\fP
(\fIsearchstr\fP \fIIS A KIND OF _____\fP relation).
.TP 15
\fB\-hypo\fP(\fIn\fP | \fIv\fP)
Display immediate hyponyms (subordinates) for \fIsearchstr\fP
(\fI_____ IS A KIND OF\fP \fIsearchstr\fP relation).
.TP 15
\fB\-tree\fP(\fIn\fP | \fIv\fP)
Display hyponym (subordinate) tree for \fIsearchstr\fP. This is
a recursive search that finds the hyponyms of each hyponym.
.TP 15
\fB\-coor\fP(\fIn\fP | \fIv\fP)
Display the coordinates (sisters) of \fIsearchstr\fP. This
search prints the immediate hypernym for each synset that contains
\fIsearchstr\fP and the hypernym's immediate hyponyms.
.TP 15
\fB\-deri\fP(\fIn\fP | \fIv\fP)
Display derivational morphology links between noun and verb forms.
.TP 15
\fB\-domn\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
Display domain that \fIsearchstr\fP has been classified in.
.TP 15
\fB\-domt\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
Display all terms classified as members of the \fIsearchstr\fP's domain.
.TP 15
.B \-subsn
Display substance meronyms of \fIsearchstr\fP
(\fIHAS SUBSTANCE\fP relation).
.TP 15
.B \-partn
Display part meronyms of \fIsearchstr\fP
(\fIHAS PART\fP relation).
.TP 15
.B \-membn
Display member meronyms of \fIsearchstr\fP
(\fIHAS MEMBER\fP relation).
.TP 15
.B \-meron
Display all meronyms of \fIsearchstr\fP
(\fIHAS PART, HAS MEMBER, HAS SUBSTANCE\fP relations).
.TP 15
.B \-hmern
Display meronyms for \fIsearchstr\fP tree. This is a recursive search
that prints all the meronyms of \fIsearchstr\fP and all of
its hypernyms.
.TP 15
.B \-sprtn
Display \fIpart of\fP holonyms of \fIsearchstr\fP
(\fIPART OF\fP relation).
.TP 15
.B \-smemn
Display \fImember of\fP holonyms of \fIsearchstr\fP
(\fIMEMBER OF\fP relation).
.TP 15
.B \-ssubn
Display \fIsubstance of\fP holonyms of \fIsearchstr\fP
(\fISUBSTANCE OF\fP relation).
.TP 15
.B \-holon
Display all holonyms of \fIsearchstr\fP
(\fIPART OF, MEMBER OF, SUBSTANCE OF\fP relations).
.TP 15
.B \-hholn
Display holonyms for \fIsearchstr\fP tree. This is a recursive search
that prints all the holonyms of \fIsearchstr\fP and all of each
holonym's holonyms.
.TP 15
.B \-entav
Display entailment relations of \fIsearchstr\fP.
.TP 15
.B \-framv
Display applicable verb sentence frames for \fIsearchstr\fP.
.TP 15
.B \-causv
Display \fIcause to\fP relations of \fIsearchstr\fP.
.TP 15
\fB\-pert\fP(\fIa\fP | \fIr\fP)
Display pertainyms of \fIsearchstr\fP.
.TP 15
\fB\-attr\fP(\fIn\fP | \fIa\fP)
Display adjective values for noun attribute, or noun attributes of
adjective values.
.TP 15
\fB\-grep\fP(\fIn\fP | \fIv\fP | \fIa\fP | \fIr\fP)
List compound words containing \fIsearchstr\fP as a substring.
.SH SEARCH RESULTS
The results of a search are written to the standard output. For each
search, the output consists of a one-line description of the search,
followed by the search results.
All searches other than \fB\-over\fP list all senses matching the
search results in the following general format. Items enclosed in
italicized square brackets (\fI[~...~]\fP) may not be present.
.RS
One line listing the number of senses matching the search request.
Each sense matching the search requested is displayed as follows:
.nf
\fBSense \fIn\fR
\fI[\fB{\fIsynset_offset\fB}\fI] [\fB<\fIlex_filename\fB>\fI]~~word1[\fB#\fIsense_number][,~~word2...]\fR
.fi
Where \fIn\fP is the sense number of the search word,
\fIsynset_offset\fP is the byte offset of the synset in the
\fBdata.\fIpos\fR file corresponding to the syntactic category,
\fIlex_filename\fP is the name of the lexicographer file that the
synset comes from, \fIword1\fP is the first word in the synset (note
that this is not necessarily the search word) and \fIsense_number\fP
is the WordNet sense number assigned to the preceding word.
\fIsynset_offset, lex_filename\fP, and \fIsense_number\fP are
generated when the \fB\-o, \-a,\fP and \fB\-s\fP options,
respectively, are specified.
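As a rough illustration of the synset line format above, the following Python sketch splits one such line into its optional parts. The name parse_synset_line and the sample line in the usage note are hypothetical. The sketch assumes that glosses are not being displayed and that any leading marker has already been removed.

```python
import re

# Optional "{synset_offset}" and "<lex_filename>" prefixes, then the words.
SYNSET_LINE = re.compile(
    r"^(?:[{](?P<offset>[0-9]+)[}][ ]+)?"
    r"(?:<(?P<lexfile>[^>]+)>[ ]+)?"
    r"(?P<words>.+)$"
)


def parse_synset_line(line):
    """Split one synset line into offset, lexicographer file, and words."""
    match = SYNSET_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a synset line: %r" % line)
    words = []
    for item in match.group("words").split(","):
        word, sep, sense = item.strip().partition("#")
        # A sense number is present only when it was requested.
        words.append((word, int(sense) if sep else None))
    offset = match.group("offset")
    return {
        "synset_offset": int(offset) if offset is not None else None,
        "lex_filename": match.group("lexfile"),
        "words": words,
    }
```

For an invented line such as "{00001740} <noun.Tops> entity#1, thing#2", the sketch yields the offset, the lexicographer file name, and a list of (word, sense number) pairs. Fields that were not requested come back as None.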
The synsets matching the search requested are printed below each sense's
synset output described above. Each line of output is preceded by a
marker (usually \fB=>\fP), then a synset, formatted as described
above. If a search traverses more than one level of the tree, then
successive lines are indented by spaces corresponding to their level in
the hierarchy. When the \fB\-g\fP option is specified, synset glosses
are displayed in parentheses at the end of each synset. Each synset
is printed on one line.
Senses are generally ordered from most to least frequently used, with
the most common sense numbered \fB1\fP. Frequency of use is
determined by the number of times a sense is tagged in the various
semantic concordance texts. Senses that are not semantically tagged
follow the ordered senses. Note that this ordering is only an
estimate based on usage in a small corpus.
Verb senses can be grouped by similarity of meaning, rather
than ordered by frequency of use. The \fB\-simsv\fP search prints all
senses that are close in meaning together, with a line of dashes
indicating the end of a group. See
.BR wngroups (7WN)
for a discussion of how senses are grouped.
The \fB\-over\fP search displays an overview of all the senses of the
search word in all syntactic categories. The results of this search
are similar to those of the \fB\-syns\fP search, except that no additional
(e.g. hypernym) synsets are displayed, and synset glosses are always
printed. The senses are grouped by syntactic category, and each
synset is annotated as described above with \fIsynset_offset\fP,
\fIlex_filename\fP, and \fIsense_number\fP as dictated by the
\fB\-o, \-a,\fP and \fB\-s\fP options. The overview search also
indicates how many of the senses in each syntactic category are
represented in the tagged texts. This is a way for the user to
determine whether a sense's sense number is based on semantic tagging
data, or was arbitrarily assigned. For each sense that has
appeared in such texts, the number of semantic tags to that sense is
indicated in parentheses after the sense number.
If a search cannot be performed on some senses of \fIsearchstr\fP, the
search results are headed by a string of the form:
.nf
X of Y senses of \fIsearchstr\fP
.fi
The output of the \fB\-deri\fP search shows word forms that are
morphologically related to \fIsearchstr\fP. Each word form pointed to
from \fIsearchstr\fP is displayed, preceded by \fBRELATED TO->\fP and
the syntactic category of the link, followed, on the next line, by its
synset. Printed after the word form is \fB#\fP\fIn\fP where \fIn\fP
indicates the WordNet sense number of the term pointed to.
The \fB\-domn\fP and \fB\-domt\fP searches show the domain that a
synset has been classified in and, conversely, all of the terms that
have been assigned to a specific domain. A domain is
either a \fBTOPIC,\fP \fBREGION\fP or \fBUSAGE,\fP as reflected in
the specific pointer character stored in the database, and displayed
in the output. A \fB\-domn\fP search on a term shows the domain, if
any, that each synset containing \fIsearchstr\fP has been classified
in. The output display shows the domain type (\fBTOPIC,\fP
\fBREGION\fP or \fBUSAGE\fP), followed by the syntactic category of
the domain synset and the terms in the synset. Each term is followed
by \fB#\fP\fIn\fP where \fIn\fP indicates the WordNet sense number of
the term. The converse search, \fB\-domt\fP, shows all of the synsets
that have been placed into the domain \fIsearchstr\fP, with analogous
markers.
When \fB\-framv\fP is specified, sample illustrative sentences and
generic sentence frames are displayed. If a sample sentence is found,
the base form of \fIsearchstr\fP is substituted into the sentence, and it
is printed below the synset, preceded with the \fBEX:\fP marker. When
no sample sentences are found, the generic sentence frames are
displayed. Sentence frames that are acceptable for all words in a
synset are preceded by the marker \fB*>\fP. If a frame is acceptable
for the search word only, it is preceded by the marker \fB=>\fP.
Search results for adjectives are slightly different from those for
other parts of speech. When an adjective is printed, its direct
antonym, if it has one, is also printed in parentheses. When
\fIsearchstr\fP is in a head synset, all of the head synset's
satellites are also displayed. The position of an adjective in
relation to the noun may be restricted to the \fIprenominal\fP,
\fIpostnominal\fP or \fIpredicative\fP position. Where present, these
restrictions are noted in parentheses.
When an adjective is a participle of a verb, the output indicates the
verb and displays its synset.
When an adverb is derived from an adjective, the specific adjectival
sense on which it is based is indicated.
The morphological transformations performed by the search code may
result in more than one word to search for. WordNet automatically
performs the requested search on all of the strings and returns the
results grouped by word. For example, the verb \fBsaw\fP is both the
present tense of \fBsaw\fP and the past tense of \fBsee\fP. When
passed \fIsearchstr\fP \fBsaw\fP, WordNet performs the desired search
first on \fBsaw\fP and next on \fBsee\fP, returning the list of
\fBsaw\fP senses and search results, followed by those for \fBsee\fP.
.SH EXIT STATUS
\fBwn(\|)\fP normally exits with the number of senses displayed. If
\fIsearchstr\fP is not found in WordNet, it exits with \fB0\fP.
If the WordNet database cannot be opened, an error message is
displayed and \fBwn(\|)\fP exits with \fB-1\fP.
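A calling script could interpret these exit values along the following lines. This is only a sketch: the function name describe_exit_status is hypothetical, and it assumes a POSIX system, where a process that exits with -1 is seen by its parent as 255 because only the low 8 bits of the status are kept.

```python
def describe_exit_status(status):
    """Interpret an exit status as described in EXIT STATUS above.

    Assumption: on POSIX systems an exit value of -1 is reported
    to the parent process as 255.
    """
    if status in (-1, 255):
        return "database could not be opened"
    if status == 0:
        return "search string not found"
    return "%d sense(s) displayed" % status
```

The status would typically come from something like the returncode attribute of subprocess.run(["wn", "dog", "-synsn"]). Under the same 8-bit assumption, a large sense count cannot be told apart from an error, so the result is best treated as a hint.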
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
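The way the two variables combine can be sketched in Python as follows. The function name wordnet_search_dir is hypothetical, and the precedence shown (WNSEARCHDIR first, then WNHOME, then the built-in default) is an assumption read off the defaults listed above, not taken from the library source.

```python
import os


def wordnet_search_dir(environ=None):
    """Return the database directory implied by the variables above."""
    env = os.environ if environ is None else environ
    # Assumed precedence: an explicit WNSEARCHDIR wins over WNHOME.
    if env.get("WNSEARCHDIR"):
        return env["WNSEARCHDIR"]
    home = env.get("WNHOME") or "/usr/local/WordNet-3.0"
    return home + "/dict"
```

Passing a plain dictionary instead of the real environment makes the lookup easy to try out without changing any settings.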
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B index.\fIpos\fP
database index files
.TP 20
.B data.\fIpos\fP
database data files
.TP 20
.B *.vrb
files of sentences illustrating the use of verbs
.TP 20
.B \fIpos\fP.exc
morphology exception lists
.SH SEE ALSO
.BR wnintro (1WN),
.BR wnb (1WN),
.BR wnintro (3WN),
.BR lexnames (5WN),
.BR senseidx (5WN),
.BR wndb (5WN),
.BR wninput (5WN),
.BR morphy (7WN),
.BR wngloss (7WN),
.BR wngroups (7WN).
.SH BUGS
Please report bugs to wordnet@princeton.edu.

View File

@ -1,461 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNB 1WN "Dec 2006" "WordNet 3.0" "WordNet\(tm User Commands"
.SH NAME
wnb \- WordNet window-based browser interface
.SH SYNOPSIS
.LP
\fBwnb\fP
.SH DESCRIPTION
\fBwnb(\|)\fP provides a window-based interface for browsing the
WordNet database, allowing synsets and relations to be displayed as
formatted text. For each search word, different searches are
available based on syntactic category and information available in the
database.
\fBwnb\fP is written in Tcl/Tk, which is available for Unix and Windows
platforms. This allows the same code to work on all
supported WordNet platforms without modification.
.SH WNB WINDOWS
\fBwnb(\|)\fP was developed with the philosophy that only those
searches and buttons that are applicable at the current time are
displayed. As a result, the appearance of the interface changes as it
is used. Use the standard windowing system mouse functions to open
and close the WordNet Browser Window, move the window, and change its
size.
The WordNet Browser Window contains the following areas, from top to
bottom:
.TP 20
Menubar
A menubar runs along the top of the browser window with pulldown menus
and buttons
entitled \fBFile\fP, \fBHistory\fP, \fBOptions\fP, and \fBHelp\fP.
.TP 20
Search Word Entry
Below the Menubar is a line for entering the search
word. A search word can be a single word, hyphenated string, or a
collocation. Case is ignored. Although only uninflected forms of
words are usually stored in WordNet, users may search for inflected
forms. WordNet's morphological processor finds the base form
automatically.
.TP 20
Search Selection
Below the Search Word Entry line is an area for selecting the search
type and senses to search. Until a search word is entered this area
is blank. After a search word is entered, buttons appear
corresponding to each syntactic category (\fBNoun\fP, \fBVerb\fP,
\fBAdjective\fP, \fBAdverb\fP) in which the search string is defined
in WordNet.
At the right edge of the Search Selection line is a box for entering
sense numbers. When this box is empty, search results for all senses
of the search word that match the search type are displayed. The
search may be restricted to one or more specific senses by entering a
comma or space separated list of sense numbers in the \fBSenses\fP
box. These sense numbers remain in effect until either the user
changes or deletes them, or a new search word is entered.
.TP 20
Results Window
Most of the browser window consists of a large text buffer for
displaying the results of WordNet searches. Horizontal and vertical
scroll bars are present for scrolling through the output.
.TP 20
Status Line
A status line is at the bottom of the browser window.
When search results are displayed in the Results Window, this status
line reflects the type of search selected. When there is no search
word entered, you are prompted to \fB"Enter search word and press
return."\fP If the search word entered is not in WordNet, the message
\fB"Sorry, no matches found."\fP is displayed.
.SH SEARCHING THE DATABASE
The WordNet browser navigates through WordNet in two steps. First a
search word is entered and an overview of all the senses of the word
in all syntactic categories is displayed in the Results Window.
The senses are
grouped by syntactic category, and each synset is annotated as
described above with \fIsynset_offset\fP, \fIlex_filename\fP, and
\fIsense_number\fP as dictated by the advanced search options set.
The overview search also indicates how many of the senses in each
syntactic category are represented in the tagged texts. This is a way
for the user to determine whether a sense's sense number is based on
semantic tagging data, or was arbitrarily assigned. For each sense that
has appeared in such texts, the number of semantic tags to that sense
is indicated in parentheses after the sense number.
Then, within a syntactic category, a specific search is selected. The
desired search is performed and the search results are displayed in
the Results Window. Additional searches on the same word can be
performed, or a new search word can be entered.
To enter a search word, click the mouse in the horizontal box labeled
\fBSearch Word\fP, type a single word, hyphenated string, or
collocation and press
.SB RETURN.
\fBwnb(\|)\fP responds by making a set of Part of Speech buttons appear in
the Search Selection line. Each button corresponds to a syntactic
category in which the search string is defined in WordNet. At the
same time, an Overview of the synsets for all senses of the search
word is displayed in the Results Window. The Overview includes the
gloss for each synset and also indicates which of the senses have
appeared in the semantically tagged texts. For each sense that has
appeared in such texts, the number of semantic tags to that sense is
indicated in parentheses after the sense number.
The pulldown menus in the Search Selection line list all of the
WordNet searches that can be performed for the search word in that
part of speech. To select a search, highlight it by dragging the
mouse to it, and release the mouse while it is highlighted. Drag the
mouse outside of the pulldown list and release to hide the menu
without making a selection. Dragging the mouse across the Part of
Speech buttons displays the available searches for each syntactic
category.
To restrict a search to one or more senses within a syntactic
category, enter a comma or space separated list of sense numbers in
the \fBSenses\fP box before selecting a search.
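A comma or space separated list of this kind could be read as in the following Python sketch. The function name parse_sense_list is hypothetical, and rejecting numbers below 1 is an assumption based on sense numbering starting at 1.

```python
def parse_sense_list(text):
    """Parse a comma or space separated list of sense numbers."""
    numbers = []
    for token in text.replace(",", " ").split():
        value = int(token)  # raises ValueError on non-numeric input
        if value < 1:
            raise ValueError("sense numbers start at 1")
        numbers.append(value)
    return numbers
```

An empty string gives an empty list, which corresponds to leaving the box blank so that all senses are searched.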
After a search is selected, \fBwnb(\|)\fP performs the search on the
WordNet database and displays the formatted results in the Results
Window. Whenever search results are displayed, a button entitled
\fBRedisplay Overview\fP is present at the right edge of the Search Word
Entry line. Clicking on this button redisplays the Overview of all
synsets for the search word in the Results Window.
.SS Changing the Search Word
A new search word can be entered at any time by moving to the Search
Word Entry box, if necessary highlighting it by clicking, erasing the
old string, typing a new one and pressing
.SB RETURN.
The \fBSenses\fP box is cleared if necessary, the Part of Speech buttons
applicable to the new search word appear, and the Overview for the new
search word is displayed.
The middle mouse button can also be used to select a new search word
by placing the mouse over any word in the Results Window and
clicking. The selected word will replace the text in the Search Word
Entry box, and the overview for that word will automatically be
displayed.
To select a new search string collocation from text in the
Results Window, highlight the text with the mouse and press
.SB CONTROL-S.
.SS Interrupting a Search
When a search is in progress the message \fB"Searching...(press escape
to abort)"\fP is displayed in the Status Line. Note that most
searches return very quickly, so this message isn't noticeable. As
indicated, pressing the
.SB ESCAPE
key will interrupt the search. The results of the search obtained
before the time the search was interrupted are displayed in the
Results Window.
.SH MENUS
.SS File Menu
.RS
.IP "Find keywords by substring"
Display a popup window for specifying a search of WordNet for words or
collocations that contain a specific substring. If a search word is
currently entered in the \fBSearch Word\fP box, it is used as the
substring to search for by default. The Substring Search Window
contains a box for entering a substring, a pulldown menu to its right
for specifying the part of speech to search, a large area for
displaying the search results, and action buttons at the bottom
entitled \fBSearch\fP, \fBSave\fP, \fBPrint\fP, and \fBDismiss\fP.
Once a substring is entered and a part of speech selected, clicking on
the \fBSearch\fP button causes a search to be done for all words and
collocations in WordNet, in that syntactic category, that contain the
substring according to the following criteria:
1. The substring can appear at the beginning or end of a word, hyphenated
string or collocation.
2. The substring can appear in the middle of a hyphenated string or
collocation, but only delimited on both sides by spaces or
hyphens.
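The two criteria can be sketched in Python as below. This follows the wording above rather than the browser's own code. The name matches_substring is hypothetical, and ignoring case is an assumption carried over from the search word entry.

```python
DELIMITERS = " -"  # spaces or hyphens, per criterion 2


def matches_substring(entry, substring):
    """Return True if substring occurs in entry under criterion 1 or 2."""
    if not substring:
        return False
    entry, substring = entry.lower(), substring.lower()
    # Criterion 1: at the beginning or end of the whole entry.
    if entry.startswith(substring) or entry.endswith(substring):
        return True
    # Criterion 2: in the middle, delimited on both sides.
    start = entry.find(substring, 1)
    while start != -1:
        end = start + len(substring)
        if (end < len(entry)
                and entry[start - 1] in DELIMITERS
                and entry[end] in DELIMITERS):
            return True
        start = entry.find(substring, start + 1)
    return False
```

Under these rules "of" matches "man-of-war" and "dog" matches "hot dog stand", while "off" does not match "coffee" because the occurrence is neither at an edge nor delimited.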
The search results are displayed in the large buffer. Clicking on an
item from the search results list causes \fBwnb(\|)\fP to automatically
enter that word in the \fBSearch Word\fP box of the WordNet Browser
Window and perform the Overview search.
Clicking the \fBSave\fP button generates a popup dialog for specifying
a filename to save the substring search results to. Clicking the
\fBPrint\fP button generates a popup dialog in which a print command
can be specified.
Selecting \fBDismiss\fP closes the Substring Search Window.
.IP "Save current display"
Display a popup dialog for specifying a filename to save the current
Results Window contents to.
.IP "Print current display"
Display a popup dialog in which to specify a print command to which
the current Results Window contents can be piped. Note - this option
does not exist in the Windows version.
.IP "Clear current display"
Clear the \fBSearch Word\fP and \fBSenses\fP boxes, and Results
Window.
.IP "Exit"
Exit the browser.
.RE
.SS History
This pulldown menu contains a list of the last searches performed.
Selecting an item from this list performs that search again. The
maximum number of searches stored in the list can be adjusted from the
\fBOptions\fP menu. The default is 10.
.SS Options
.RS
.IP "Show help with each search"
When this checkbox is selected search results are preceded by some
explanatory text about the type of search selected. This is off by
default.
.IP "Show descriptive gloss"
When this checkbox is selected, synset glosses are displayed in all
search results. This is set by default. Note that glosses are always
displayed in the Overview.
.IP "Wrap Lines"
When this checkbox is selected, lines in the Results Window that are
wider than the window are automatically wrapped. This is set by
default. If not selected, a horizontal scroll bar is present if any
lines are longer than the width of the window.
.IP "Set advanced search options..."
Selecting this item displays a popup window for setting the following
search options: \fBLexical file information; Synset location in database
file; Sense number\fP. Choices for each are:
.nf
\fBDon't show\fP (default)
\fBShow with searches\fP
\fBShow with searches and overview\fP
.fi
When lexical file information is shown, the name of the lexicographer
file is printed before each synset, enclosed in angle brackets
(\fB<~~\fI...\fB~~>\fR). When both lexical file information and
synset location information are displayed, the synset location
information appears first. If within one lexicographer file more than
one sense of a word is entered, an integer \fIlex_id\fP is appended
onto all but one of the word's instances to uniquely identify it. In
each synset, each word having a non-zero \fIlex_id\fP is printed with
the \fIlex_id\fP value printed immediately following the word. If
both lexicographer information and sense numbers are displayed,
\fIlex_id\fPs, if present, precede sense numbers.
When synset location is shown, the byte offset of the synset in the
database "data" file corresponding to the syntactic category of the
synset is printed before each synset, enclosed in curly braces
(\fB{~~\fI...\fB~~}\fR). When both lexical file information and
synset location information are displayed, the synset location
information appears first.
When sense numbers are shown, the sense number of each word in each
synset is printed immediately after the word, and is preceded by a
number sign (\fB#\fP).
.IP "Set maximum history length..."
Display a popup dialog in which the maximum number of previous
searches to be kept on the History list can be set.
.IP "Set font...~~~~~~~~~~~"
Display a popup window for setting the font (typeface) and font size
to use for the Results Window. Choices for typeface are: \fBCourier\fP,
\fBHelvetica\fP, and \fBTimes\fP (default). Font size can be
\fBsmall\fP, \fBmedium\fP (default), or \fBlarge\fP.
.IP "Save current options as default"
Save the currently set options. Next time the browser is started,
these options will be used as the user defaults.
.RE
.SS Help
.RS
.IP "Help on using the WordNet browser"
Display this manual page.
.IP "Help on WordNet terminology"
Display the
.BR wngloss (7WN)
manual page.
.IP "Display the WordNet license"
Display the WordNet copyright notice and license agreement.
.IP "About the WordNet browser"
Information about this application.
.RE
.SH SHORTCUTS
Clicking on any word in the Results Window while holding down the
.SB SHIFT
key on the keyboard causes the browser to replace \fBSearch
Word\fP with the word and display its Overview and available searches.
Clicking on any word in the Results Window with the middle mouse
button does the same thing.
Pressing the
.SB CONTROL-S
keys causes the browser to do as above on the text that is currently
highlighted. Under Unix, this will work even if the highlighted text
is in another window. This works on
hyphenated strings and collocations, as well as individual words.
Pressing the
.SB CONTROL-G
keys displays the Substring Search Window.
.SH SEARCH RESULTS
The results of a search of the WordNet database are displayed in the
Results Window. Horizontal and vertical scroll bars are present for
scrolling through the search results.
All searches other than the Overview list all senses matching the
search results in the following general format. Items enclosed in
italicized square brackets (\fI[~...~]\fP) may not be present.
If a search cannot be performed on some senses of \fIsearchstr\fP, the
search results are headed by a string of the form:
.nf
X of Y senses of \fIsearchstr\fP
.fi
.RS
One line listing the number of senses matching the search selected.
Each sense matching the search selected is displayed as follows:
.nf
\fBSense \fIn\fR
\fI[\fB{\fIsynset_offset\fB}\fI] [\fB<\fIlex_filename\fB>\fI]~~word1[\fB#\fIsense_number][,~~word2...]\fR
.fi
Where \fIn\fP is the sense number of the search word,
\fIsynset_offset\fP is the byte offset of the synset in the
\fBdata.\fIpos\fR file corresponding to the syntactic category,
\fIlex_filename\fP is the name of the lexicographer file that the
synset comes from, \fIword1\fP is the first word in the synset (note
that this is not necessarily the search word) and \fIsense_number\fP
is the WordNet sense number assigned to the preceding word.
\fIsynset_offset\fP, \fIlex_filename\fP, and \fIsense_number\fP are
generated if the appropriate Options are specified.
The synsets matching the search selected are printed below each
sense's synset output described above. Each line of output is
preceded by a marker (usually \fB=>\fP), then a synset, formatted as
described above. If a search traverses more than one level of the tree,
then successive lines are indented by spaces corresponding to their
level in the hierarchy. Glosses are displayed in parentheses at the
end of each synset if the appropriate Option is set. Each synset is
printed on one line.
Senses are ordered from most to least frequently used, with
the most common sense numbered \fB1\fP. Frequency of use is
determined by the number of times a sense is tagged in the various
semantic concordance texts. Senses that are not semantically tagged
follow the ordered senses. Note that this ordering is only an
estimate based on usage in a small corpus.
Verb senses can be grouped by similarity of meaning, rather
than ordered by frequency of use. When the \fB"Synonyms, grouped by
similarity"\fP search is selected, senses that are close
in meaning are printed together, with a line of dashes indicating the
end of a group. See
.BR wngroups (7WN)
for a discussion of how senses are grouped.
The output of the \fB"Derivationally Related Forms"\fP
search shows word forms that are
morphologically related to \fIsearchstr\fP. Each word form pointed to
from \fIsearchstr\fP is displayed, preceded by \fBRELATED TO->\fP and
the syntactic category of the link, followed, on the next line, by its
synset. Printed after the word form is \fB#\fP\fIn\fP where \fIn\fP
indicates the WordNet sense number of the term pointed to.
The \fB"Domain"\fP and \fB"Domain Terms"\fP searches show the domain that a
synset has been classified in and, conversely, all of the terms that
have been assigned to a specific domain. A domain is
either a \fBTOPIC,\fP \fBREGION\fP or \fBUSAGE,\fP as reflected in
the specific pointer character stored in the database, and displayed
in the output. A \fBDomain\fP search on a term shows the domain, if
any, that each synset containing \fIsearchstr\fP has been classified
in. The output display shows the domain type (\fBTOPIC,\fP
\fBREGION\fP or \fBUSAGE\fP), followed by the syntactic category of
the domain synset and the terms in the synset. Each term is followed
by \fB#\fP\fIn\fP where \fIn\fP indicates the WordNet sense number of
the term. The converse search, \fBDomain Terms\fP, shows all of the synsets
that have been placed into the domain \fIsearchstr\fP, with analogous
markers.
When the \fB"Sentence Frames"\fP search is specified, sample
illustrative sentences and generic sentence frames are displayed. If
a sample sentence is found, the base form of the search word is
substituted into the sentence, and it is printed below the synset,
preceded with the \fBEX:\fP marker. When no sample sentences are
found, the generic sentence frames are displayed. Sentence frames
that are acceptable for all words in a synset are preceded by the
marker \fB*>\fP. If a frame is acceptable for the search word only,
it is preceded by the marker \fB=>\fP.
Search results for adjectives are slightly different from those for
other parts of speech. When an adjective is printed, its direct
antonym, if it has one, is also printed in parentheses. When
the search word is in a head synset, all of the head synset's
satellites are also displayed. The position of an adjective in
relation to the noun may be restricted to the \fIprenominal\fP,
\fIpostnominal\fP or \fIpredicative\fP position. Where present, these
restrictions are noted in parentheses.
When an adjective is a participle of a verb, the output indicates the
verb and displays its synset.
When an adverb is derived from an adjective, the specific adjectival
sense on which it is based is indicated.
The morphological transformations performed by the search code may
result in more than one word to search for. \fBwnb(\|)\fP
automatically performs the requested search on all of the strings and
returns the results grouped by word. For example, the verb \fBsaw\fP
is both the present tense of \fBsaw\fP and the past tense of
\fBsee\fP. When there is more than one word to search for, search
results are grouped by word.
.SH DIAGNOSTICS
If the WordNet database files cannot be opened, error messages are
displayed. This is usually corrected by setting the environment
variables described below to the proper location of the WordNet
database for your installation.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.TP 20
.B HKEY_CURRENT_USER\eSOFTWARE\eWordNet\e3.0\ewnres
User's default browser options.
.SH FILES
.TP 20
.B index.\fIpos\fP
database index files
.TP 20
.B data.\fIpos\fP
database data files
.TP 20
.B *.vrb
files of sentences illustrating the use of verbs
.TP 20
.B \fIpos\fP.exc
morphology exception lists
.SH SEE ALSO
.BR wnintro (1WN),
.BR wn (1WN),
.BR wnintro (3WN),
.BR lexnames (5WN),
.BR senseidx (5WN),
.BR wndb (5WN),
.BR wninput (5WN),
.BR morphy (7WN),
.BR wngloss (7WN),
.BR wngroups (7WN).
.SH BUGS
Please report bugs to wordnet@princeton.edu.

View File

@ -1,362 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNDB 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
index.noun, data.noun, index.verb, data.verb, index.adj, data.adj, index.adv, data.adv \- WordNet database files
.LP
noun.exc, verb.exc, adj.exc, adv.exc \- morphology exception lists
.LP
sentidx.vrb, sents.vrb \- files used by search code to display
sentences illustrating the use of some specific verbs
.SH DESCRIPTION
For each syntactic category, two files are needed to represent the
contents of the WordNet database \- \fBindex.\fP\fIpos\fP and
\fBdata.\fP\fIpos\fP, where \fIpos\fP is \fBnoun\fP, \fBverb\fP,
\fBadj\fP or \fBadv\fP. The other auxiliary files are used by the
WordNet library's searching functions and are needed to run the
various WordNet browsers.
Each index file is an alphabetized list of all the words found in
WordNet in the corresponding part of speech. On each line, following
the word, is a list of byte offsets (\fIsynset_offset\fPs) in the
corresponding data file, one for each synset containing the word.
Words in the index file are in lower case only, regardless of how they
were entered in the lexicographer files. This folds various
orthographic representations of the word into one line enabling
database searches to be case insensitive. See
.BR wninput (5WN)
for a detailed description of the lexicographer files.
A data file for a syntactic category contains information
corresponding to the synsets that were specified in the lexicographer
files, with relational pointers resolved to \fIsynset_offset\fPs.
Each line corresponds to a synset. Pointers are followed and
hierarchies traversed by moving from one synset to another via the
\fIsynset_offset\fPs.
The exception list files, \fIpos\fP\fB.exc\fP, are used to help the
morphological processor find base forms from irregular inflections.
The files \fBsentidx.vrb\fP and \fBsents.vrb\fP contain sentences
illustrating the use of specific senses of some verbs. These files
are used by the searching software in response to a request for verb
sentence frames. Generic sentence frames are displayed when an
illustrative sentence is not present.
The various database files are in ASCII formats that are easily read
by both humans and machines. All fields, unless otherwise noted, are
separated by one space character, and all lines are terminated by a
newline character. Fields enclosed in italicized square brackets may
not be present.
See
.BR wngloss (7WN)
for a glossary of WordNet terminology and a discussion of the
database's content and logical organization.
.SS Index File Format
Each index file begins with several lines containing a copyright
notice, version number and license agreement. These lines all begin
with two spaces and the line number so they do not interfere with the
binary search algorithm that is used to look up entries in the index
files. All other lines are in the following format. In the field
descriptions, \fBnumber\fP always refers to a decimal integer unless
otherwise defined.
.nf
\fIlemma~~pos~~synset_cnt~~p_cnt~~[ptr_symbol...]~~sense_cnt~~tagsense_cnt ~~synset_offset~~[synset_offset...]\fP
.fi
.TP 15
.I lemma
lower case ASCII text of word or collocation. Collocations are formed
by joining individual words with an underscore (\fB_\fP) character.
.TP 15
.I pos
Syntactic category: \fBn\fP for noun files,
\fBv\fP for verb files, \fBa\fP for adjective files, \fBr\fP for
adverb files.
.LP
All remaining fields are with respect to senses of \fIlemma\fP in
\fIpos\fP.
.TP 15
.I synset_cnt
Number of synsets that \fIlemma\fP is in. This is the
number of senses of the word in WordNet. See
.SM \fBSense Numbers\fP
below for a discussion of how sense numbers are assigned and the order
of \fIsynset_offset\fPs in the index files.
.TP 15
.I p_cnt
Number of different pointers that \fIlemma\fP has in all synsets
containing it.
.TP 15
.I ptr_symbol
A space separated list of \fIp_cnt\fP different types of pointers that
\fIlemma\fP has in all synsets containing it. See
.BR wninput (5WN)
for a list of \fIpointer_symbol\fPs. If all senses of \fIlemma\fP
have no pointers, this field is omitted and \fIp_cnt\fP is \fB0\fP.
.TP 15
.I sense_cnt
Same as \fIsynset_cnt\fP above. This is redundant, but the field was
preserved for compatibility reasons.
.TP 15
.I tagsense_cnt
Number of senses of \fIlemma\fP that are ranked according to
their frequency of occurrence in semantic concordance texts.
.TP 15
.I synset_offset
Byte offset in \fBdata.\fIpos\fR file of a synset containing
\fIlemma\fP. Each \fIsynset_offset\fP in the list corresponds to a
different sense of \fIlemma\fP in WordNet. \fIsynset_offset\fP is an
8 digit, zero-filled decimal integer that can be used with
.BR fseek (3)
to read a synset from the data file. When passed to
.BR read_synset (3WN)
along with the syntactic category, a data structure containing the
parsed synset is returned.
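.LP
The index line layout described above can be sketched in Python as
follows. This is only an illustration of the field order, not the
WordNet library's own parser; the names \fBIndexEntry\fP,
\fBparse_index_line\fP and \fBis_header_line\fP are assumptions made
for this example, and the sample line is made up rather than copied
from a real index file.
.nf
```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    lemma: str
    pos: str
    synset_cnt: int
    ptr_symbols: list
    tagsense_cnt: int
    synset_offsets: list


def is_header_line(line):
    """Copyright and license lines begin with two spaces."""
    return line.startswith("  ")


def parse_index_line(line):
    """Split one non-header line of an index.pos file into its fields."""
    fields = line.split()
    lemma, pos = fields[0], fields[1]
    synset_cnt = int(fields[2])
    p_cnt = int(fields[3])
    # p_cnt pointer symbols follow; the list is absent when p_cnt is 0.
    ptr_symbols = fields[4:4 + p_cnt]
    rest = fields[4 + p_cnt:]
    # rest[0] is the redundant sense_cnt, rest[1] is tagsense_cnt.
    tagsense_cnt = int(rest[1])
    # Offsets stay as strings to keep the 8 digit, zero-filled form.
    synset_offsets = rest[2:]
    if len(synset_offsets) != synset_cnt:
        raise ValueError("offset count does not match synset_cnt")
    return IndexEntry(lemma, pos, synset_cnt, ptr_symbols,
                      tagsense_cnt, synset_offsets)


# Hypothetical line, for illustration only.
sample = "dog n 2 2 @ + 2 1 02084071 10114209"
entry = parse_index_line(sample)
print(entry.lemma, entry.synset_cnt, entry.synset_offsets)
```
.fi
Keeping each \fIsynset_offset\fP as a string preserves the zero-filled
form; it can be converted with \fBint()\fP when a seek position is
needed.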
.SS Data File Format
Each data file begins with several lines containing a copyright
notice, version number and license agreement. These lines all begin
with two spaces and the line number. All other lines are in the
following format. Integer fields are of fixed length, and are
zero-filled.
.nf
\fIsynset_offset~~lex_filenum~~ss_type~~w_cnt~~word~~lex_id~~[word~~lex_id...]~~p_cnt~~[ptr...]~~[frames...]~~\fB|\fP\fI~~gloss\fP
.fi
.TP 15
.I synset_offset
Current byte offset in the file represented as an 8 digit decimal
integer.
.TP 15
.I lex_filenum
Two digit decimal integer corresponding to the lexicographer file name
containing the synset. See
.BR lexnames (5WN)
for the list of filenames and their corresponding numbers.
.TP 15
.I ss_type
One character code indicating the synset type:
.RS
.nf
\fBn\fP NOUN
\fBv\fP VERB
\fBa\fP ADJECTIVE
\fBs\fP ADJECTIVE SATELLITE
\fBr\fP ADVERB
.fi
.RE
.TP 15
.I w_cnt
Two digit hexadecimal integer indicating the number of words in the
synset.
.TP 15
.I word
ASCII form of a word as entered in the synset by the lexicographer,
with spaces replaced by underscore characters (\fB_\fP). The text of
the word is case sensitive, in contrast to its form in the
corresponding \fBindex.\fP\fIpos\fP file, that contains only
lower-case forms. In \fBdata.adj\fP, a \fIword\fP is followed by a
syntactic marker if one was specified in the lexicographer file. A
syntactic marker is appended, in parentheses, onto \fIword\fP without
any intervening spaces. See
.BR wninput (5WN)
for a list of the syntactic markers for adjectives.
.TP 15
.I lex_id
One digit hexadecimal integer that, when appended onto \fIlemma\fP,
uniquely identifies a sense within a lexicographer file. \fIlex_id\fP
numbers usually start with \fB0\fP, and are incremented as additional
senses of the word are added to the same file, although there is no
requirement that the numbers be consecutive or begin with \fB0\fP.
Note that a value of \fB0\fP is the default, and therefore is not
present in lexicographer files.
.TP 15
.I p_cnt
Three digit decimal integer indicating the number of pointers from
this synset to other synsets. If \fIp_cnt\fP is \fB000\fP the synset
has no pointers.
.TP 15
.I ptr
A pointer from this synset to another. \fIptr\fP is of the form:
.nf
\fIpointer_symbol~~synset_offset~~pos~~source/target\fR
.fi
where \fIsynset_offset\fP is the byte offset of the target synset in
the data file corresponding to \fIpos\fP.
The \fIsource/target\fP field distinguishes lexical and semantic
pointers. It is a four byte field, containing two two-digit
hexadecimal integers. The first two digits indicate the word number
in the current (source) synset, the last two digits indicate the word
number in the target synset. A value of \fB0000\fP means that
\fIpointer_symbol\fP represents a semantic relation between the
current (source) synset and the target synset indicated by
\fIsynset_offset\fP.
A lexical relation between two words in different synsets is
represented by non-zero values in the source and target word numbers.
The first and last two bytes of this field indicate the word numbers
in the source and target synsets, respectively, between which the
relation holds. Word numbers are assigned to the \fIword\fP fields in
a synset, from left to right, beginning with \fB1\fP.
See
.BR wninput (5WN)
for a list of \fIpointer_symbol\fPs, and semantic and lexical pointer
classifications.
.TP 15
.I frames
In \fBdata.verb\fP only, a list of numbers corresponding to the
generic verb sentence frames for \fIword\fPs in the synset.
\fIframes\fP is of the form:
.nf
\fIf_cnt~~\fP \fB+\fP \fI~~f_num~~w_num~~[\fP \fB+\fP \fI~~f_num~~w_num...]\fP
.fi
where \fIf_cnt\fP is a two digit decimal integer indicating the number of
generic frames listed, \fIf_num\fP is a two digit decimal integer
frame number, and \fIw_num\fP is a two digit hexadecimal integer
indicating the word in the synset that the frame applies to. As with
pointers, if this number is \fB00\fP, \fIf_num\fP applies to all
\fIword\fPs in the synset. If non-zero, it is applicable only to the
word indicated. Word numbers are assigned as described for pointers.
Each \fIf_num~~w_num\fP pair is preceded by a \fB+\fP.
See
.BR wninput (5WN)
for the text of the generic sentence frames.
.TP
.I gloss
Each synset contains a gloss. A \fIgloss\fP is represented as a
vertical bar (\fB|\fP), followed by a text string that continues until
the end of the line. The gloss may contain a definition, one or more
example sentences, or both.
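.LP
As a rough sketch of the data line layout described above, the
following Python function walks the fields in order. It is not the
library's
.BR read_synset (3WN)
routine; the function name \fBparse_data_line\fP, the dictionary keys
and the sample line are all assumptions made for this example.
.nf
```python
def parse_data_line(line):
    """Parse one non-header line of a data.pos file into a dict."""
    head, _, gloss = line.partition("|")
    f = head.split()
    synset = {
        "synset_offset": f[0],
        "lex_filenum": int(f[1]),
        "ss_type": f[2],
        "gloss": gloss.strip(),
    }
    w_cnt = int(f[3], 16)              # two digit hexadecimal
    i = 4
    words = []
    for _ in range(w_cnt):
        # In data.adj the word may carry a marker such as "(p)".
        words.append((f[i], int(f[i + 1], 16)))   # (word, lex_id)
        i += 2
    p_cnt = int(f[i])                  # three digit decimal
    i += 1
    pointers = []
    for _ in range(p_cnt):
        symbol, offset, pos, src_tgt = f[i:i + 4]
        pointers.append({
            "symbol": symbol,
            "synset_offset": offset,
            "pos": pos,
            # Both halves are 0 for a semantic pointer.
            "source_word": int(src_tgt[:2], 16),
            "target_word": int(src_tgt[2:], 16),
        })
        i += 4
    frames = []
    if i < len(f):                     # frames occur in data.verb only
        f_cnt = int(f[i])
        i += 1
        for _ in range(f_cnt):
            # f[i] is the "+" that precedes each f_num w_num pair.
            frames.append((int(f[i + 1]), int(f[i + 2], 16)))
            i += 3
    synset["words"] = words
    synset["pointers"] = pointers
    synset["frames"] = frames
    return synset


# Hypothetical verb synset line, for illustration only.
sample = ("00001740 29 v 02 breathe 0 respire 1 001 @ 00002325 v 0000 "
          "02 + 02 00 + 08 01 | draw air into and expel out of the lungs")
parsed = parse_data_line(sample)
print(parsed["words"], parsed["frames"])
```
.fi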
.SS Sense Numbers
Senses in WordNet are generally ordered from most to least frequently
used, with the most common sense numbered \fB1\fP. Frequency of use is
determined by the number of times a sense is tagged in the various
semantic concordance texts. Senses that are not semantically tagged
follow the ordered senses. The \fItagsense_cnt\fP field for each
entry in the \fBindex.\fIpos\fR files indicates how many of the senses
in the list have been tagged.
The
.BR cntlist (5WN)
file provided with the database lists the number of times each sense
is tagged in the semantic concordances. The data from \fBcntlist\fP
is used by
.BR grind (1WN)
to order the senses of each word. When the \fBindex\fP.\fIpos\fP
files are generated, the \fIsynset_offset\fPs are output in sense
number order, with sense 1 first in the list. Senses with the same
number of semantic tags are assigned unique but consecutive sense
numbers. The WordNet
.SB OVERVIEW
search displays all senses of the
specified word, in all syntactic categories, and indicates which of
the senses are represented in the semantically tagged texts.
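.LP
The relationship between list position, sense number and
\fItagsense_cnt\fP can be illustrated with a short Python sketch. The
function name \fBsenses_in_order\fP and the offsets shown are
hypothetical; the sketch only restates the ordering rule given above.
.nf
```python
def senses_in_order(synset_offsets, tagsense_cnt):
    """Yield (sense_number, synset_offset, is_tagged) in index order.

    Sense 1 is first in the list, and the tagged senses come before
    the untagged ones, so the first tagsense_cnt entries are tagged.
    """
    for number, offset in enumerate(synset_offsets, start=1):
        yield number, offset, number <= tagsense_cnt


# Hypothetical offsets taken from one index entry.
for row in senses_in_order(["02084071", "10114209", "07676602"], 2):
    print(row)
```
.fi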
.SS Exception List File Format
Exception lists are alphabetized lists of inflected forms of words and
their base forms. The first field of each line is an inflected form,
followed by a space separated list of one or more base forms of the
word. There is one exception list file for each syntactic category.
Note that the noun and verb exception lists were automatically
generated from a machine-readable dictionary, and contain many words
that are not in WordNet. Also, for many of the inflected forms, base
forms could be easily derived using the standard rules of detachment
programmed into Morphy (See
.BR morphy (7WN)).
These anomalies are allowed to remain in the exception list files,
as they do no harm.
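.LP
A minimal Python sketch of reading an exception list, assuming the
line format described above (an inflected form followed by one or more
space separated base forms). The function names and the sample lines
are illustrative assumptions, not part of the WordNet library.
.nf
```python
def parse_exception_line(line):
    """Return (inflected_form, [base_form, ...]) for one line."""
    fields = line.split()
    return fields[0], fields[1:]


def load_exceptions(lines):
    """Build a lookup table from inflected form to base forms."""
    table = {}
    for line in lines:
        if line.strip():
            inflected, bases = parse_exception_line(line)
            table[inflected] = bases
    return table


# Hypothetical lines in the style of a noun exception list.
sample = ["geese goose", "mice mouse", "axes ax axis"]
exceptions = load_exceptions(sample)
print(exceptions["axes"])
```
.fi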
.SS Verb Example Sentences
For some verb senses, example sentences illustrating the use of the
verb sense can be displayed. Each line of the file \fBsentidx.vrb\fP
contains a \fIsense_key\fP followed by a space and a comma separated
list of example sentence template numbers, in decimal. The file
\fBsents.vrb\fP lists all of the example sentence templates. Each
line begins with the template number followed by a space. The rest of
the line is the text of a template example sentence, with \fB%s\fP
used as a placeholder in the text for the verb. Both files are sorted
alphabetically so that the \fIsense_key\fP and template sentence
number can be used as indices, via
.BR binsrch (3WN),
into the appropriate file.
When a request for
.SB FRAMES
is made, the WordNet search code looks
for the sense in \fBsentidx.vrb\fP. If found, the sentence
template(s) listed is retrieved from \fBsents.vrb\fP, and the \fB%s\fP
is replaced with the verb. If the sense is not found, the applicable
generic sentence frame(s) listed in \fIframes\fP is displayed.
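.LP
The lookup just described can be sketched in Python. The sketch uses
plain dictionaries rather than
.BR binsrch (3WN),
and the function names, the sense key and the template sentences are
all hypothetical values chosen for illustration.
.nf
```python
def parse_sentidx_line(line):
    """Return (sense_key, [template_number, ...]) for a sentidx line."""
    sense_key, _, numbers = line.strip().partition(" ")
    return sense_key, [int(n) for n in numbers.split(",") if n]


def parse_sents_line(line):
    """Return (template_number, template_text) for a sents line."""
    number, _, template = line.strip().partition(" ")
    return int(number), template


def example_sentences(sense_key, verb, sentidx, sents):
    """Fill the %s placeholder of each template listed for a sense."""
    numbers = sentidx.get(sense_key, [])
    return [sents[n].replace("%s", verb) for n in numbers]


# Hypothetical file contents, for illustration only.
sentidx = dict([parse_sentidx_line("run%2:38:00:: 1,2")])
sents = dict([
    parse_sents_line("1 The children %s to the playground"),
    parse_sents_line("2 The horses %s across the field"),
])
print(example_sentences("run%2:38:00::", "run", sentidx, sents))
```
.fi
An empty result corresponds to the case where the sense is not found
and the generic sentence frames are displayed instead.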
.SH NOTES
Information in the \fBdata.\fIpos\fR and \fBindex.\fIpos\fR files
represents all of the word senses and synsets in the WordNet database.
The \fIword\fP, \fIlex_id\fP, and \fIlex_filenum\fP fields together
uniquely identify each word sense in WordNet. These can be encoded in
a \fIsense_key\fP as described in
.BR senseidx (5WN).
Each synset in the database can be uniquely identified by combining
the \fIsynset_offset\fP for the synset with a code for the syntactic
category (since it is possible for synsets in different
\fBdata.\fIpos\fR files to have the same \fIsynset_offset\fP).
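.LP
To make the identification scheme concrete, the following Python
sketch pairs a syntactic category code with a \fIsynset_offset\fP and
uses the offset as a seek position. The helper names and the tiny
in-memory file are assumptions made for this example; a real program
would open the appropriate \fBdata.\fIpos\fR file in binary mode.
.nf
```python
import io

LF = bytes([10])  # the newline byte that terminates every line


def read_synset_line(data_file, synset_offset):
    """Seek to a decimal byte offset and return the line found there."""
    data_file.seek(int(synset_offset))
    return data_file.readline().decode("ascii").rstrip()


def synset_id(pos, synset_offset):
    """Offsets can repeat across data.pos files, so pair them with pos."""
    return (pos, synset_offset)


# In-memory stand-in for a data file: one header line, then one synset.
buf = io.BytesIO()
buf.write(b"  1 hypothetical header line" + LF)
offset = "{:08d}".format(buf.tell())
buf.write((offset + " 03 n 01 entity 0 000 | a made-up gloss").encode("ascii") + LF)

line = read_synset_line(buf, offset)
print(synset_id("n", offset), line)
```
.fi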
The WordNet system provides both command line and window-based browser
interfaces to the database. Both interfaces utilize a common library
of search and morphology code. The source code for the library and
interfaces is included in the WordNet package. See
.BR wnintro (3WN)
for an overview of the WordNet source code.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
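.LP
The defaults above can be expressed as a small Python sketch. The
function name \fBwordnet_search_dir\fP is an assumption for this
example; the WordNet library performs its own lookup in C.
.nf
```python
import os


def wordnet_search_dir(environ=None):
    """Resolve the database directory from WNSEARCHDIR and WNHOME."""
    env = os.environ if environ is None else environ
    home = env.get("WNHOME", "/usr/local/WordNet-3.0")
    # WNSEARCHDIR wins when set; otherwise fall back to WNHOME/dict.
    return env.get("WNSEARCHDIR", home + "/dict")


print(wordnet_search_dir({}))
print(wordnet_search_dir({"WNHOME": "/opt/wordnet"}))
```
.fi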
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B index.\fIpos\fP
database index files
.TP 20
.B data.\fIpos\fP
database data files
.TP 20
.B *.vrb
files of sentences illustrating the use of verbs
.TP 20
.B \fIpos\fP.exc
morphology exception lists
.SH SEE ALSO
.BR grind (1WN),
.BR wn (1WN),
.BR wnb (1WN),
.BR wnintro (3WN),
.BR binsrch (3WN),
.BR wnintro (5WN),
.BR cntlist (5WN),
.BR lexnames (5WN),
.BR senseidx (5WN),
.BR wninput (5WN),
.BR morphy (7WN),
.BR wngloss (7WN),
.BR wngroups (7WN),
.BR wnstats (7WN).

@ -1,291 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNGLOSS 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
wngloss \- glossary of terms used in WordNet system
.SH DESCRIPTION
The \fIWordNet Reference Manual\fP consists of Unix-style manual pages
divided into sections as follows:
.TS
center box ;
c | c
c | l.
\fBSection\fP \fBDescription\fP
_
1 WordNet User Commands
3 WordNet Library Functions
5 WordNet File Formats
7 Miscellaneous Information about WordNet
.TE
.SS System Description
The WordNet system consists of lexicographer files, code to convert
these files into a database, and search routines and interfaces that
display information from the database. The lexicographer files
organize nouns, verbs, adjectives and adverbs into groups of synonyms,
and describe relations between synonym groups.
.BR grind (1WN)
converts the lexicographer files into a database that encodes the
relations between the synonym groups. The different interfaces to the
WordNet database utilize a common library of search routines to
display these relations. Note that the lexicographer files and
.BR grind (1WN)
program are not generally distributed.
.SS Database Organization
Information in WordNet is organized around logical groupings called
synsets. Each synset consists of a list of synonymous words or
collocations (e.g. \fB"fountain pen"\fP, \fB"take in"\fP), and pointers
that describe the relations between this synset and other synsets. A
word or collocation may appear in more than one synset, and in more
than one part of speech. The words in a synset are grouped
such that they are interchangeable in some context.
Two kinds of relations are represented by pointers: lexical and
semantic. Lexical relations hold between semantically related
word forms; semantic
relations hold between word meanings. These relations include (but
are not limited to) hypernymy/hyponymy (superordinate/subordinate),
antonymy, entailment, and meronymy/holonymy.
Nouns and verbs are organized into hierarchies based on the
hypernymy/hyponymy relation between synsets. Additional pointers are
used to indicate other relations.
Adjectives are arranged in clusters containing head synsets and
satellite synsets. Each cluster is organized around antonymous pairs
(and occasionally antonymous triplets). The antonymous pairs (or
triplets) are indicated in the head synsets of a cluster. Most head
synsets have one or more satellite synsets, each of which represents a
concept that is similar in meaning to the concept represented by the
head synset. One way to think of the adjective cluster organization
is to visualize a wheel, with a head synset as the hub and satellite
synsets as the spokes. Two or more wheels are logically connected via
antonymy, which can be thought of as an axle between the wheels.
Pertainyms are relational adjectives and do not follow the structure
just described. Pertainyms do not have antonyms; the synset for a
pertainym most often contains only one word or collocation and a
lexical pointer to the noun that the adjective is "pertaining
to". Participial adjectives have lexical pointers to the verbs that
they are derived from.
Adverbs are often derived from adjectives, and sometimes have
antonyms; therefore the synset for an adverb usually contains a
lexical pointer to the adjective from which it is derived.
See
.BR wndb (5WN)
for a detailed description of the database files and how the data are
represented.
.SH GLOSSARY OF TERMS
Many terms used in the \fIWordNet Reference Manual\fP are unique to
the WordNet system. Other general terms have specific meanings when
used in the WordNet documentation. Definitions for many of these
terms are given to help with the interpretation and understanding of
the reference manual, and in the use of the WordNet system.
In the following definitions \fBword\fP is used in place of \fBword or
collocation\fP.
.TP 25
.B adjective cluster
A group of adjective synsets that are organized around antonymous
pairs or triplets. An adjective cluster contains two or more \fBhead
synsets\fR which represent antonymous concepts.
Each head synset has one or more \fBsatellite synsets\fP.
.TP 25
.B attribute
A noun for which adjectives express values.
The noun \fBweight\fP is an attribute, for which the adjectives
\fBlight\fP and \fBheavy\fP express values.
.TP 25
.B base form
The base form of a word or collocation is the form to which
inflections are added.
.TP 25
.B basic synset
Syntactically, same as \fBsynset\fP. Term is used in
.BR wninput (5WN)
to help explain differences in entering synsets in lexicographer
files.
.TP 25
.B collocation
A collocation in WordNet is a string of two or more words, connected
by spaces or hyphens. Examples are: \fBman-eating~shark\fP,
\fBblue-collar\fP, \fBdepend~on\fP, \fBline~of~products\fP. In the
database files spaces are represented as underscore (\fB_\fP)
characters.
.TP 25
.B coordinate
Coordinate terms are nouns or verbs that have the same \fBhypernym\fP.
.TP 25
.B cross-cluster pointer
A \fBsemantic pointer\fP from one adjective cluster to another.
.TP 25
.B derivationally related forms
Terms in different
syntactic categories that have the same root form and are semantically
related.
.TP 25
.B direct antonyms
A pair of words between which there is an associative bond resulting
from their frequent
co-occurrence. In \fBadjective clusters\fP, direct antonyms appear
only in \fBhead synsets\fP.
.TP 25
.B domain
A topical classification to which a synset has been linked with a
CATEGORY, REGION or USAGE pointer.
.TP 25
.B domain term
A synset belonging to a topical class. A domain term is further
identified as being a CATEGORY_TERM, REGION_TERM or USAGE_TERM.
.TP 25
.B entailment
A verb \fBX\fP entails \fBY\fP if \fBX\fP cannot be done unless \fBY\fP is,
or has been, done.
.TP 25
.B exception list
Morphological transformations for words that are not regular and
therefore cannot be processed in an algorithmic manner.
.TP 25
.B group
Verb senses that are similar in meaning and have been manually grouped
together.
.TP 25
.B gloss
Each synset contains a \fBgloss\fP consisting of a definition and
optionally example sentences.
.TP 25
.B head synset
Synset in an adjective \fBcluster\fP containing at least one word
that has a \fBdirect antonym\fP.
.TP 25
.B holonym
The name of the whole of which the meronym names a part. \fBY\fP
is a holonym of \fBX\fP if \fBX\fP is a part of \fBY\fP.
.TP 25
.B hypernym
The generic term used to designate a whole class of specific instances.
\fBY\fP is a hypernym of \fBX\fP if \fBX\fP is a (kind of) \fBY\fP.
.TP 25
.B hyponym
The specific
term used to designate a member of a class. \fBX\fP is a hyponym of
\fBY\fP if \fBX\fP is a (kind of) \fBY\fP.
.TP 25
.B indirect antonym
An adjective in a \fBsatellite synset\fP that does not have a
\fBdirect antonym\fP
has an indirect antonym via the direct antonym of the \fBhead
synset\fP.
.TP 25
.B instance
A proper noun that refers
to a particular, unique referent (as distinguished from nouns that
refer to classes). This is a specific form of hyponym.
.TP 25
.B lemma
Lower case ASCII text of word as found in the WordNet database index
files. Usually the \fBbase form\fP for a word or collocation.
.TP 25
.B lexical pointer
A lexical pointer indicates a relation between words in synsets (word
forms).
.TP
.B lexicographer file
Files containing the raw data for WordNet synsets, edited by lexicographers,
that are input to the \fBgrind\fP program to generate a WordNet database.
.TP
.B lexicographer id (lex id)
A decimal integer that, when appended onto \fBlemma\fP, uniquely
identifies a sense within a lexicographer file.
.TP
.B monosemous
Having only one sense in a syntactic category.
.TP 25
.B meronym
The name of a constituent part of, the substance of, or a member of
something. \fBX\fP is a meronym of \fBY\fP if \fBX\fP is a part of \fBY\fP.
.TP 25
.B part of speech
WordNet defines "part of speech" as either noun, verb, adjective, or
adverb. Same as \fBsyntactic category\fP.
.TP 25
.B participial adjective
An adjective that is derived from a verb.
.TP 25
.B pertainym
A relational adjective. Adjectives that are pertainyms are usually
defined by such phrases as "of or pertaining to" and do not have
antonyms. A pertainym can point to a noun or another pertainym.
.TP 25
.B polysemous
Having more than one sense in a syntactic category.
.TP 25
.B polysemy count
Number of senses of a word in a syntactic category, in WordNet.
.TP 25
.B postnominal
A postnominal adjective occurs only immediately following the noun
that it modifies.
.TP 25
.B predicative
An adjective that can be used only in predicate positions. If \fBX\fP
is a predicate adjective, it can only be used in such phrases as "it is
\fBX\fP" and never prenominally.
.TP 25
.B prenominal
An adjective that can occur only before the noun that it modifies: it
cannot be used predicatively.
.TP 25
.B satellite synset
Synset in an adjective \fBcluster\fP representing a concept that is
similar in meaning to the concept represented by its \fBhead
synset\fP.
.TP 25
.B semantic concordance
A textual corpus (e.g. the Brown Corpus) and a lexicon (e.g. WordNet)
so combined
that every substantive word in the text is linked to its appropriate
sense in the lexicon via a \fBsemantic tag\fP.
.TP 25
.B semantic tag
A pointer from a word in a text file to a specific sense of that word in the
WordNet database. A semantic tag in a semantic concordance is
represented by a \fBsense key\fP.
.TP 25
.B semantic pointer
A semantic pointer indicates a relation between synsets (concepts).
.TP 25
.B sense
A meaning of a word in WordNet. Each sense of a word is in a
different \fBsynset\fP.
.TP 25
.B sense key
Information necessary to find a sense in the WordNet database. A
sense key combines a \fBlemma\fP field and codes for the synset type,
lexicographer id, lexicographer file number, and information about a
satellite's \fBhead synset\fP, if required. See
.BR senseidx (5WN)
for a description of the format of a sense key.
.TP 25
.B subordinate
Same as \fBhyponym\fP.
.TP 25
.B superordinate
Same as \fBhypernym\fP.
.TP 25
.B synset
A synonym set; a set of words that are interchangeable in some
context without changing the truth value of the proposition in which
they are embedded.
.TP 25
.B troponym
A verb expressing a specific manner elaboration of another verb.
\fBX\fP is a troponym of \fBY\fP if \fBto X\fP is \fBto Y\fP in some manner.
.TP 25
.B unique beginner
A noun synset with no \fBsuperordinate\fP.

@ -1,43 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNGROUPS 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
wngroups \- discussion of WordNet search code to group similar verb senses
.SH DESCRIPTION
Some similar senses of verbs have been grouped by the lexicographers.
This grouping is done statically in the lexicographer source files
using the semantic \fIpointer_symbol\fP \fB$\fP.
Transitivity is used to combine groups of overlapping
senses into the largest sense groups possible.
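.LP
One way to picture the transitive combination of overlapping groups is
a union-find pass over the pairwise links, sketched below in Python.
This is not the WordNet search code itself; the function name
\fBmerge_sense_groups\fP and the integer sense labels are assumptions
made for this example.
.nf
```python
def merge_sense_groups(pairs):
    """Merge pairwise verb group links into the largest groups."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for sense in list(parent):
        groups.setdefault(find(sense), set()).add(sense)
    return sorted(sorted(group) for group in groups.values())


# Senses 1-2 and 2-3 overlap, so they end up in one group.
print(merge_sense_groups([(1, 2), (2, 3), (5, 6)]))
```
.fi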
.SH NOTES
Coverage of verb groups is incomplete.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 20
.B sentidx.vrb
verb sense keys and sentence frame numbers
.TP 20
.B sents.vrb
example sentence frames
.SH SEE ALSO
.BR wn (1WN),
.BR wnb (1WN),
.BR senseidx (5WN),
.BR wnsearch (3WN),
.BR wndb (5WN),
.BR wnintro (7WN).

@ -1,513 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNINPUT 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
noun.\fIsuffix\fP, verb.\fIsuffix\fP, adj.\fIsuffix\fP, adv.\fIsuffix\fP \-
WordNet lexicographer files that are input to
.BR grind (1WN)
.SH DESCRIPTION
WordNet's source files are written by lexicographers. They are the
product of a detailed relational analysis of lexical semantics: a
variety of lexical and semantic relations are used to represent the
organization of lexical knowledge. Two kinds of building blocks are
distinguished in the source files: word forms and word meanings. Word
forms are represented in their familiar orthography; word meanings are
represented by synonym sets (\fIsynset\fPs) \- lists of synonymous
word forms that are interchangeable in some context. Two kinds of
relations are recognized: lexical and semantic. Lexical relations
hold between word forms; semantic relations hold between word
meanings.
Lexicographer files correspond to the syntactic categories implemented
in WordNet \- noun, verb, adjective and adverb. All of the synsets in
a lexicographer file are in the same syntactic category. Each synset
consists of a list of synonymous words or collocations
(e.g. \fB"fountain pen"\fP, \fB"take in"\fP), and pointers that
describe the relations between this synset and other synsets. These
relations include (but are not limited to) hypernymy/hyponymy,
antonymy, entailment, and meronymy/holonymy. A word or collocation
may appear in more than one synset, and in more than one part of
speech. Each use of a word in a synset represents a sense of that
word in the part of speech corresponding to the synset.
Adjectives may be organized into clusters containing head synsets and
satellite synsets. Adverbs generally point to the adjectives from
which they are derived.
See
.BR wngloss (7WN)
for a glossary of WordNet terminology and a discussion of the
database's content and logical organization.
.SS Lexicographer File Names
The names of the lexicographer files are of the form:
.RS
.IR pos . suffix
.RE
where \fIpos\fP is either \fBnoun\fP, \fBverb\fP, \fBadj\fP or
\fBadv\fP. \fIsuffix\fP may be used to organize groups of synsets
into different files, for example \fBnoun.animal\fP and
\fBnoun.plant\fP. See
.BR lexnames (5WN)
for a list of lexicographer file names that are used in building
WordNet.
.SS Pointers
Pointers are used to represent the relations between the words in one
synset and another. Semantic pointers represent relations between
word meanings, and therefore pertain to all of the words in the source
and target synsets. Lexical pointers represent relations between word
forms, and pertain only to specific words in the source and target
synsets. The following pointer types are usually used to indicate
lexical relations: Antonym, Pertainym, Participle, Also See, Derivationally
Related. The remaining pointer types are generally used to represent semantic
relations.
A relation from a source to a target synset is formed by specifying
a word from the target synset in the source synset, followed by the
\fIpointer_symbol\fP indicating the pointer type. The location of a pointer
within a synset defines it as either lexical or semantic.
The
.SB "Lexicographer File Format"
section describes the syntax for entering a semantic pointer, and
.SB "Word Syntax"
describes the syntax for entering a lexical pointer.
Although there are many pointer types, only certain types of relations
are permitted between synsets of each syntactic category.
The \fIpointer_symbol\fPs for nouns are:
.RS
.nf
\fB!\fP Antonym
\fB@\fP Hypernym
\fB@i\fP Instance Hypernym
\fB\(ap\fP Hyponym
\fB\(api\fP Instance Hyponym
\fB#m\fP Member holonym
\fB#s\fP Substance holonym
\fB#p\fP Part holonym
\fB%m\fP Member meronym
\fB%s\fP Substance meronym
\fB%p\fP Part meronym
\fB=\fP Attribute
\fB+\fP Derivationally related form
\fB;c\fP Domain of synset - TOPIC
\fB-c\fP Member of this domain - TOPIC
\fB;r\fP Domain of synset - REGION
\fB-r\fP Member of this domain - REGION
\fB;u\fP Domain of synset - USAGE
\fB-u\fP Member of this domain - USAGE
.fi
.RE
The \fIpointer_symbol\fPs for verbs are:
.RS
.nf
\fB!\fP Antonym
\fB@\fP Hypernym
\fB\(ap\fP Hyponym
\fB*\fP Entailment
\fB>\fP Cause
\fB^\fP Also see
\fB$\fP Verb Group
\fB+\fP Derivationally related form
\fB;c\fP Domain of synset - TOPIC
\fB;r\fP Domain of synset - REGION
\fB;u\fP Domain of synset - USAGE
.fi
.RE
The \fIpointer_symbol\fPs for adjectives are:
.RS
.nf
\fB!\fP Antonym
\fB&\fP Similar to
\fB<\fP Participle of verb
\fB\e\fP Pertainym (pertains to noun)
\fB=\fP Attribute
\fB^\fP Also see
\fB;c\fP Domain of synset - TOPIC
\fB;r\fP Domain of synset - REGION
\fB;u\fP Domain of synset - USAGE
.fi
.RE
The \fIpointer_symbol\fPs for adverbs are:
.RS
.nf
\fB!\fP Antonym
\fB\e\fP Derived from adjective
\fB;c\fP Domain of synset - TOPIC
\fB;r\fP Domain of synset - REGION
\fB;u\fP Domain of synset - USAGE
.fi
.RE
Many pointer types are reflexive, meaning that if a synset contains a
pointer to another synset, the other synset should contain a
corresponding reflexive pointer.
.BR grind (1WN)
automatically inserts missing reflexive pointers for the following
pointer types:
.TS
center box ;
c | c
l | l .
\fBPointer\fP \fBReflect\fP
_
Antonym Antonym
Hyponym Hypernym
Hypernym Hyponym
Instance Hyponym Instance Hypernym
Instance Hypernym Instance Hyponym
Holonym Meronym
Meronym Holonym
Similar to Similar to
Attribute Attribute
Verb Group Verb Group
Derivationally Related Derivationally Related
Domain of synset	Member of Domain
.TE
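.LP
The effect of the table above can be illustrated with a short Python
sketch that adds any missing reflected pointer. It is not the
algorithm used by
.BR grind (1WN);
the names \fBREFLECT\fP and \fBadd_missing_reflections\fP, the triple
representation and the sample words are assumptions made for this
example.
.nf
```python
REFLECT = {
    "Antonym": "Antonym",
    "Hyponym": "Hypernym",
    "Hypernym": "Hyponym",
    "Instance Hyponym": "Instance Hypernym",
    "Instance Hypernym": "Instance Hyponym",
    "Holonym": "Meronym",
    "Meronym": "Holonym",
    "Similar to": "Similar to",
    "Attribute": "Attribute",
    "Verb Group": "Verb Group",
    "Derivationally Related": "Derivationally Related",
    "Domain of synset": "Member of Domain",
}


def add_missing_reflections(pointers):
    """pointers is a set of (source, pointer_type, target) triples."""
    completed = set(pointers)
    for source, ptr_type, target in pointers:
        reflected = REFLECT.get(ptr_type)
        if reflected is not None:
            completed.add((target, reflected, source))
    return completed


# Hypothetical pointers; Entailment has no reflection in the table.
given = {
    ("dog", "Hypernym", "canine"),
    ("hot", "Antonym", "cold"),
    ("snore", "Entailment", "sleep"),
}
for triple in sorted(add_missing_reflections(given)):
    print(triple)
```
.fi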
.SS Verb Frames
Each verb synset contains a list of generic sentence frames
illustrating the types of simple sentences in which the verbs in the
synset can be used. For some verb senses, example sentences
illustrating actual uses of the verb are provided. (See
.SB "Verb Example Sentences"
in
.BR wndb (5WN).)
Whenever there is no example sentence, the generic sentence frames
specified by the lexicographer are used. The generic sentence frames
are entered in a synset as a comma-separated list of integer frame
numbers. The following list is the text of the generic frames,
preceded by their frame numbers:
.RS
.nf
1 Something ----s
2 Somebody ----s
3 It is ----ing
4 Something is ----ing PP
5 Something ----s something Adjective/Noun
6 Something ----s Adjective/Noun
7 Somebody ----s Adjective
8 Somebody ----s something
9 Somebody ----s somebody
10 Something ----s somebody
11 Something ----s something
12 Something ----s to somebody
13 Somebody ----s on something
14 Somebody ----s somebody something
15 Somebody ----s something to somebody
16 Somebody ----s something from somebody
17 Somebody ----s somebody with something
18 Somebody ----s somebody of something
19 Somebody ----s something on somebody
20 Somebody ----s somebody PP
21 Somebody ----s something PP
22 Somebody ----s PP
23 Somebody's (body part) ----s
24 Somebody ----s somebody to INFINITIVE
25 Somebody ----s somebody INFINITIVE
26 Somebody ----s that CLAUSE
27 Somebody ----s to somebody
28 Somebody ----s to INFINITIVE
29 Somebody ----s whether INFINITIVE
30 Somebody ----s somebody into V-ing something
31 Somebody ----s something with something
32 Somebody ----s INFINITIVE
33 Somebody ----s VERB-ing
34 It ----s that CLAUSE
35 Something ----s INFINITIVE
.fi
.RE
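.LP
As a small illustration of how a generic frame becomes a displayed
sentence, the Python sketch below substitutes a verb for the
\fB----\fP placeholder. Only four frames from the list above are
copied in; the names \fBFRAMES\fP and \fBfill_frame\fP are assumptions,
and the substitution is deliberately naive (it does no inflection
beyond the suffix already present in the frame text).
.nf
```python
# A few of the generic frames listed above, keyed by frame number.
FRAMES = {
    1: "Something ----s",
    2: "Somebody ----s",
    8: "Somebody ----s something",
    9: "Somebody ----s somebody",
}


def fill_frame(f_num, verb):
    """Replace the ---- placeholder in a generic frame with a verb."""
    return FRAMES[f_num].replace("----", verb)


def parse_frame_list(text):
    """Parse a comma-separated list of integer frame numbers."""
    return [int(n) for n in text.split(",") if n.strip()]


for number in parse_frame_list("2, 8"):
    print(fill_frame(number, "eat"))
```
.fi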
.SS Lexicographer File Format
Synsets are entered one per line, and each line is terminated with a
newline character. A line containing a synset may be as long as
necessary, but no newlines can be entered within a synset. Within a
synset, spaces or tabs may be used to separate entities. Items
enclosed in italicized square brackets may not be present.
The general synset syntax is:
.RS
.nf
\fB{\fP \fI~~words~~pointers~~\fP \fB(\fP \fI~gloss~\fP \fB)~~}\fR
.fi
.RE
Synsets of this form are valid for all syntactic categories except
verb, and are referred to as basic synsets. At least one \fIword\fP
and a \fIgloss\fP are required to form a valid synset. Pointers
entered following all the \fIwords\fP in a synset represent semantic
relations between all the words in the source and target synsets.
For verbs, the basic synset syntax is defined as follows:
.KS
.RS
.nf
\fB{\fP \fI~~words~~pointers~~frames~~\fP \fB(\fP ~\fIgloss~\fP \fB)~~}\fR
.fi
.RE
Adjectives may be organized into clusters containing one or more head
synsets and optional satellite synsets. Adjective clusters are of the
form:
.RS
.nf
\fB[
\fIhead synset
[satellite synsets]
[\-]
[additional head/satellite synsets]
\fB]\fR
.fi
.RE
.KE
Each adjective cluster is enclosed in square brackets, and may have
one or more parts. Each part consists of a head synset and optional
satellite synsets that are conceptually similar to the head synset's
meaning. Parts of a cluster are separated by one or more hyphens
(\fB\-\fP) on a line by themselves, with the terminating square
bracket following the last synset. Head and satellite synsets follow
the syntax of basic synsets; however, a "Similar to" pointer must be
specified in a head synset for each of its satellite synsets. Most
adjective clusters contain two antonymous parts. See
.BR wngloss (7WN)
for a discussion of adjective clusters, and
.SB "Special Adjective Syntax"
for more information on adjective cluster syntax.
Synsets for relational adjectives (pertainyms) and participial
adjectives do not adhere to the cluster structure. They use the basic
synset syntax.
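.LP
The grouping of a cluster into parts can be sketched in Python as
follows. The name \fBsplit_cluster\fP is hypothetical, and the sketch
assumes that the opening and closing square brackets each sit on a
line of their own, which the format itself does not require.
.RS
.nf
```python
def split_cluster(lines):
    """Group the synset lines of one adjective cluster into parts."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    if not rows or rows[0] != "[" or rows[-1] != "]":
        raise ValueError("a cluster is enclosed in square brackets")
    parts, current = [], []
    for row in rows[1:-1]:
        if set(row) == {"-"}:
            # A line made only of hyphens closes the current part.
            parts.append(current)
            current = []
        else:
            current.append(row)
    parts.append(current)
    # In each part the first synset is the head; the rest are satellites.
    return [(part[0], part[1:]) for part in parts if part]
```
.fi
.RE
Each element of the result pairs a head synset line with the list of
its satellite synset lines.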
Comments can be entered in a lexicographer file by enclosing the text
of the comment in parentheses. Note that comments \fBcannot\fP appear
within a synset, as parentheses within a synset have an entirely
different meaning (see
.SB "Gloss Syntax"
). However, entire synsets (or adjective clusters) can be "commented
out" by enclosing them in parentheses. This is often used by the
lexicographers to verify the syntax of files under development or to
leave a note to oneself while working on entries.
.SS Word Syntax
A synset must have at least one word, and the words of a synset must
appear after the opening brace and before any other synset constructs.
A word may be entered in either the simple word or word/pointer
syntax.
A simple word is of the form:
.RS
.nf
\fIword[\fP \fB(\fP \fImarker\fP \fB)\fP \fI][lex_id]\fP \fB,\fR
.fi
.RE
\fIword\fP may be entered in any combination of upper and lower case
unless it is in an adjective cluster. A collocation is entered by
joining the individual words with an underscore character (\fB_\fP).
Numbers (integer or real) may be entered, either by themselves or as
part of a word string, by following the number with a double quote
(\fB"\fP).
See
.SB "Special Adjective Syntax"
for a description of adjective clusters and markers.
\fIword\fP may be followed by an integer \fIlex_id\fP from \fB1\fP to
\fB15\fP. The \fIlex_id\fP is used to distinguish different senses of
the same word within a lexicographer file. The lexicographer assigns
\fIlex_id\fP values, usually in ascending order, although there is no
requirement that the numbers be consecutive. The default is \fB0\fP,
and does not have to be specified. A \fIlex_id\fP must be used on
pointers if the desired sense has a non-zero \fIlex_id\fP in its
synset specification.
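.LP
A simple word can be taken apart as in the following Python sketch.
The name \fBparse_simple_word\fP is hypothetical. The sketch accepts
the adjective marker on either side of the \fIlex_id\fP, does not
check the range of the \fIlex_id\fP, and treats any trailing digits as
a \fIlex_id\fP, so it is not suitable for numbers entered with a
double quote.
.RS
.nf
```python
MARKERS = ("(p)", "(a)", "(ip)")

def parse_simple_word(token):
    """Return (word, marker, lex_id) for a token such as "dog1,"."""
    if not token.endswith(","):
        raise ValueError("a simple word ends with a comma")
    text = token[:-1]

    def strip_marker(s):
        for m in MARKERS:
            if s.endswith(m):
                return s[:-len(m)], m[1:-1]
        return s, None

    def strip_digits(s):
        head = s.rstrip("0123456789")
        digits = s[len(head):]
        # The default lex_id is 0 when none is written.
        return head, (int(digits) if digits else 0)

    text, marker = strip_marker(text)
    text, lex_id = strip_digits(text)
    if marker is None:
        # The marker may also sit between the word and the lex_id.
        text, marker = strip_marker(text)
    return text, marker, lex_id
```
.fi
.RE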
Word/pointer syntax is of the form:
.RS
.nf
\fB[~~\fP \fIword[\fP \fB(\fP \fImarker\fP \fB)\fP \fI][lex_id]\fP \fB,\fP \fI~~pointers~~\fP \fB]\fR
.fi
.RE
This syntax is used when one or more pointers correspond only to the
specific word in the word/pointer set, rather than all the words in
the synset, and represents a lexical relation. Note that a
word/pointer set appears within a synset, therefore the square
brackets used to enclose it are treated differently from those used to
define an adjective cluster. Only one word can be specified in each
word/pointer set, and any number of pointers may be included. A
synset can have any number of word/pointer sets. Each is treated by
.BR grind (1WN)
essentially as a \fIword\fP, so they all must appear
before any synset \fIpointers\fP representing semantic relations.
For verbs, the word/pointer syntax is extended in the following manner
to allow the user to specify generic sentence frames that, like
pointers, correspond only to a specific word, rather than all the
words in the synset. In this case, \fIpointers\fP are optional.
.RS
.nf
\fB[~~\fP \fIword\fP \fB,\fP ~~\fI[pointers]~~frames~~\fP \fB]\fR
.fi
.RE
.SS Pointer Syntax
Pointers are optional in synsets. If a pointer is specified outside
of a word/pointer set, the relation is applied to all of the words in
the synset, including any words specified using the word/pointer
syntax. This indicates a semantic relation between the meanings of
the words in the synsets. If specified within a word/pointer set, the
relation corresponds only to the word in the set and represents a
lexical relation.
A pointer is of the form:
.RS
.nf
\fI[lex_filename\fP\fB:\fP \fI]word[lex_id]\fP\fB,\fP\fIpointer_symbol\fR
.fi
.RE
or:
.RS
.nf
\fI[lex_filename\fP\fB:\fP \fI]word[lex_id]\fP\fB^\fP\fIword[lex_id]\fP\fB,\fP\fIpointer_symbol\fR
.fi
.RE
For pointers, \fIword\fP indicates a word in another synset. When the
second form of a pointer is used, the first \fIword\fP indicates a
word in a head synset, and the second is a word in a satellite of that
cluster. \fIword\fP may be followed by a \fIlex_id\fP that is used to
match the pointer to the correct target synset. The synset containing
\fIword\fP may reside in another lexicographer file. In this case,
\fIword\fP is preceded by \fIlex_filename\fP as shown.
See
.SB "Pointers"
for a list of \fIpointer_symbol\fPs and their meanings.
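.LP
Both pointer forms can be split into their fields as in the Python
sketch below. The name \fBparse_pointer\fP and the dictionary keys are
hypothetical. Any \fIlex_id\fP is left attached to its word, and the
sketch assumes that a colon appears only after \fIlex_filename\fP.
.RS
.nf
```python
def parse_pointer(token):
    """Split a pointer token into its fields."""
    # The pointer symbol follows the last comma in the token.
    target, sep, symbol = token.rpartition(",")
    if not sep or not symbol:
        raise ValueError("expected word,pointer_symbol")
    lex_filename = None
    if ":" in target:
        lex_filename, target = target.split(":", 1)
    # In the second form a caret joins a head word and a satellite word.
    head, _, satellite = target.partition("^")
    return {
        "lex_filename": lex_filename,
        "word": head,
        "satellite_word": satellite or None,
        "symbol": symbol,
    }
```
.fi
.RE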
.SS Verb Frame List Syntax
Frame numbers corresponding to generic sentence frames must be entered
in each verb synset. If a frame list is specified outside of a
word/pointer set, the verb frames in the list apply to all of the
words in the synset, including any words specified using the
word/pointer syntax. If specified within a word/pointer set, the verb
frames in the list correspond only to the word in the set.
A frame number list is entered as follows:
.RS
\fBframes:\fP~~\fIf_num\fP[\fB,\fP\fIf_num...]\fR
.RE
Where \fIf_num\fP specifies a generic frame number.
See
.SB "Verb Frames"
for a list of generic sentences and their corresponding frame numbers.
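.LP
Reading a frame number list can be sketched in Python as follows. The
name \fBparse_frames\fP is hypothetical, and the upper bound of 35 is
taken from the last entry of the frame list in this manual page.
.RS
.nf
```python
def parse_frames(text):
    """Return the frame numbers in a string such as "frames: 8, 10"."""
    prefix = "frames:"
    if not text.startswith(prefix):
        raise ValueError("a frame list starts with the keyword frames:")
    # int() ignores the white space that may surround each number.
    numbers = [int(n) for n in text[len(prefix):].split(",")]
    for n in numbers:
        if not 1 <= n <= 35:
            raise ValueError("frame number out of range: %d" % n)
    return numbers
```
.fi
.RE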
.SS Gloss Syntax
A gloss is included in all synsets. The lexicographer may enter a
text string of any length desired. A gloss is simply a string
enclosed in parentheses with no embedded carriage returns. It
provides a definition of what the synset represents and/or example
sentences.
.SS Special Adjective Syntax
The syntax for representing antonymous adjective synsets requires
several additional conditions.
The first word of a head synset \fBmust\fP be entered in upper case,
and can be thought of as the head word of the head synset. The
\fIword\fP part of a pointer from one head synset to another head
synset within the same cluster (usually an antonym) must also be
entered in upper case. Usually antonymous adjectives are entered
using the word/pointer syntax described in
.SB "Word Syntax"
to indicate a lexical relation. There is no restriction on the number
of parts that a cluster may have, and some clusters have three parts,
representing antonymous triplets, such as \fBsolid\fP, \fBliquid\fP,
and \fBgas\fP.
A cross-cluster pointer may be specified, allowing a head or satellite
synset to point to a head synset in a different cluster. A
cross-cluster pointer is indicated by entering the \fIword\fP part of
the pointer in upper case.
An adjective may be annotated with a syntactic marker indicating a
limitation on the syntactic position the adjective may have in
relation to the noun that it modifies. If so marked, the marker appears
between the word and its following comma. If a \fIlex_id\fP is
specified, the marker immediately follows it. The syntactic markers
are:
.RS
.nf
\fB(p)\fP predicate position
\fB(a)\fP prenominal (attributive) position
\fB(ip)\fP immediately postnominal position
.fi
.RE
.SH EXAMPLES
\fI(Note that these are hypothetical examples not found in the WordNet
lexicographer files.)\fP
Sample noun synsets:
.RS
.nf
{ canine, [ dog1, cat,! ] pooch, canid,@ }
{ collie, dog1,@ (large multi-colored dog with pointy nose) }
{ hound, hunting_dog, pack,#m dog1,@ }
{ dog, }
.fi
.RE
Sample verb synsets:
.RS
.nf
{ [ confuse, clarify,! frames: 1 ] blur, obscure, frames: 8, 10 }
{ [ clarify, confuse,! ] make_clear, interpret,@ frames: 8 }
{ interpret, construe, understand,@ frames: 8 }
.fi
.RE
Sample adjective clusters:
.RS
.nf
[
{ [ HOT, COLD,! ] lukewarm(a), TEPID,^ (hot to the touch) }
{ warm, }
\-
{ [ COLD, HOT,! ] frigid, (cold to the touch) }
{ freezing, }
]
.fi
.RE
Sample adverb synsets:
.RS
.nf
{ [ basically, adj.all:essential^basic,\e ] [ essentially, adj.all:basic^fundamental,\e ] ( by one's very nature )}
{ pointedly, adj.all:pungent^pointed,\e }
{ [ badly, adj.all:bad,\e well,! ] ill, ("He was badly prepared") }
.fi
.RE
.SH SEE ALSO
.BR grind (1WN),
.BR wnintro (5WN),
.BR lexnames (5WN),
.BR wndb (5WN),
.BR uniqbeg (7WN),
.BR wngloss (7WN).
.LP
Fellbaum, C. (1998), ed.
\fI"WordNet: An Electronic Lexical Database"\fP.
MIT Press, Cambridge, MA.


@ -1,53 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNINTRO 1WN "Dec 2006" "WordNet 3.0" "WordNet\(tm User Commands"
.SH NAME
wnintro \- WordNet user commands
.SH SYNOPSIS
.LP
\fBwn\fP \- command line interface to WordNet database
.LP
\fBwnb\fP \- window based WordNet browser
.SH DESCRIPTION
This section of the \fIWordNet Reference Manual\fP contains manual
pages that describe commands available with the various WordNet system
packages.
The WordNet interfaces
.BR wn (1WN)
and
.BR wnb (1WN)
allow the user to search the WordNet database and display the
information textually.
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
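.LP
The way the two defaults combine on Unix can be sketched in Python as
follows. The function name \fBwordnet_search_dir\fP is hypothetical;
the WordNet programs perform this lookup in their own code.
.RS
.nf
```python
import os

def wordnet_search_dir(env=None):
    """Return the database directory implied by the environment."""
    env = os.environ if env is None else env
    # WNHOME falls back to the documented default base directory.
    home = env.get("WNHOME", "/usr/local/WordNet-3.0")
    # WNSEARCHDIR, when set, takes precedence over WNHOME/dict.
    return env.get("WNSEARCHDIR", home + "/dict")
```
.fi
.RE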
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH SEE ALSO
.BR grind (1WN),
.BR wn (1WN),
.BR wnb (1WN),
.BR wnintro (3WN),
.BR wnintro (5WN),
.BR wnintro (7WN).
.LP
Fellbaum, C. (1998), ed.
\fI"WordNet: An Electronic Lexical Database"\fP.
MIT Press, Cambridge, MA.
.SH AVAILABILITY
WordNet has a World Wide Web site at
\fBhttp://wordnet.princeton.edu\fP. From this web site
users can learn about the WordNet project, run several different
interfaces to the WordNet database, and download various WordNet
system packages and \fI"Five Papers on WordNet"\fP.


@ -1,280 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNINTRO 3WN "Dec 2006" "WordNet 3.0" "WordNet\(tm Library Functions"
.SH NAME
wnintro \- introduction to WordNet library functions
.SH DESCRIPTION
This section of the \fIWordNet Reference Manual\fP contains manual
pages that describe the WordNet library functions and API.
Functions are organized into the following categories:
.TS
center box ;
l | l | l.
\fBCategory\fP \fBManual Page\fP \fBObject File\fP
_
Database Search wnsearch (3WN) search.o
Morphology morph (3WN) morph.o
Misc. Utility wnutil (3WN) wnutil.o
Binary Search binsrch (3WN) binsrch.o
.TE
The WordNet library is used by all of the searching interfaces
provided with the various WordNet packages. Additional programs in
the system, such as
.BR grind (1WN),
also use functions in this library.
The WordNet library is provided in both source and binary forms (on
some platforms) to allow users to build applications and tools to
their own specifications that utilize the WordNet database. We do not
provide programming support or assistance.
The code conforms to ANSI C standards. Functions are defined with
function prototypes. If you do not have a compiler that accepts
prototypes, you must edit the source code and remove the prototypes
before compiling.
.SH LIST OF WORDNET LIBRARY FUNCTIONS
Not all library functions are listed below. Missing are mainly
functions that are called by documented ones, or ones that were
written for specific applications or tools used during WordNet
development. Data structures are defined in
\fBwn.h\fP.
.SS Database Searching Functions (search.o)
.TP 25
.B findtheinfo
Primary search function for WordNet database. Returns
formatted search results in text buffer. Used by WordNet interfaces
to perform requested search.
.TP 25
.B findtheinfo_ds
Primary search function for WordNet database. Returns search results
in linked list data structure.
.TP 25
.B is_defined
Set bit for each search type that is valid for the search word passed
and return bit mask.
.TP 25
.B in_wn
Set bit for each syntactic category that search word is in.
.TP 25
.B index_lookup
Find word in index file and return parsed entry in data structure.
Input word must be exact match of string in database. Called by
\fBgetindex(\|)\fP.
.TP 25
.B getindex
Find word in index file, trying different techniques \- replace hyphens
with underscores, replace underscores with hyphens, strip hyphens and
underscores, strip periods.
.TP 25
.B read_synset
Read synset from data file at byte offset passed and return parsed
entry in data structure. Calls \fBparse_synset(\|)\fP.
.TP 25
.B parse_synset
Read synset at current byte offset in file and return parsed entry in
data structure.
.TP 25
.B free_syns
Free a synset linked list allocated by \fBfindtheinfo_ds(\|)\fP.
.TP 25
.B free_synset
Free a synset structure.
.TP 25
.B free_index
Free an index structure.
.TP 25
.B traceptrs_ds
Recursive search algorithm to trace a pointer tree and return results
in linked list.
.TP 25
.B do_trace
Do requested search on synset passed returning formatted output in
buffer.
.SS Morphology Functions (morph.o)
.TP 25
.B morphinit
Open exception list files.
.TP 25
.B re_morphinit
Close exception list files and reopen.
.TP 25
.B morphstr
Try to find base form (lemma) of word or collocation in syntactic
category passed. Calls \fBmorphword(\|)\fP for each word in string
passed.
.TP 25
.B morphword
Try to find base form (lemma) of individual word in syntactic category
passed.
.SS Utility Functions (wnutil.o)
.TP 25
.B wninit
Top level function to open database files and morphology exception
lists.
.TP 25
.B re_wninit
Top level function to close and reopen database files and morphology
exception lists.
.TP 25
.B cntwords
Count the number of underscore or space separated words in a string.
.TP 25
.B strtolower
Convert string to lower case and remove trailing adjective marker if
found.
.TP 25
.B ToLowerCase
Convert string passed to lower case.
.TP 25
.B strsubst
Replace all occurrences of \fIfrom\fP with \fIto\fP in \fIstr\fP.
.TP 25
.B getptrtype
Return code for pointer type character passed.
.TP 25
.B getpos
Return syntactic category code for string passed.
.TP 25
.B getsstype
Return synset type code for string passed.
.TP 25
.B FmtSynset
Reconstruct synset string from synset pointer.
.TP 25
.B StrToPos
Passed string for syntactic category, returns corresponding integer
value.
.TP 25
.B GetSynsetForSense
Return synset for sense key passed.
.TP 25
.B GetDataOffset
Find synset offset for sense.
.TP 25
.B GetPolyCount
Find polysemy count for sense passed.
.TP 25
.B GetWORD
Return word part of sense key.
.TP 25
.B GetPOS
Return syntactic category code for sense key passed.
.TP 25
.B WNSnsToStr
Generate sense key for index entry passed.
.TP 25
.B GetValidIndexPointer
Search for string and/or base form of word in database and return index
structure for word if found.
.TP 25
.B GetWNSense
Return sense number in database for sense key.
.TP 25
.B GetSenseIndex
Return parsed sense index entry for sense key passed.
.TP 25
.B default_display_message
Default function to use as value of \fBdisplay_message\fP. Simply
returns \fB-1\fP.
.SS Binary Search Functions (binsrch.o)
.TP 25
.B bin_search
General purpose binary search function to search for key as first item
on line in sorted file.
.TP 25
.B copyfile
Copy contents from one file to another.
.TP 25
.B replace_line
Replace a line in a sorted file.
.TP 25
.B insert_line
Insert a line into a sorted file.
.SH HEADER FILE
.TP 20
.B wn.h
WordNet include file of constants, data structures, external
declarations for global variables initialized in \fBwnglobal.c\fP.
Also lists function prototypes for library API. It must be included to
use any WordNet library functions.
.SH NOTES
All library functions that access the database files expect the files
to be open. The function
.BR wninit (3WN)
must be called before other database access functions such as
.BR findtheinfo (3WN)
or
.BR read_synset (3WN).
Inclusion of the header file \fBwn.h\fP is necessary.
The command line interface is a good example of a simple application
that uses several WordNet library functions.
Many of the library functions are passed or return syntactic category
or synset type information. The following table lists the possible
categories as integer codes, synset type constant names, syntactic
category constant names, single characters and character strings.
.TS
center box ;
c | c | c | c | c
c | l | l | c | l.
\fBInteger\fP \fBSynset Type\fP \fBSyntactic Category\fP \fBChar\fP \fBString\fP
_
1 NOUN NOUN n noun
2 VERB VERB v verb
3 ADJ ADJ a adj
4 ADV ADV r adv
5 SATELLITE ADJ s \fIn/a\fP
.TE
.SH ENVIRONMENT VARIABLES (UNIX)
.TP 20
.B WNHOME
Base directory for WordNet. Default is
\fB/usr/local/WordNet-3.0\fP.
.TP 20
.B WNSEARCHDIR
Directory in which the WordNet database has been installed.
Default is \fBWNHOME/dict\fP.
.SH REGISTRY (WINDOWS)
.TP 20
.B HKEY_LOCAL_MACHINE\eSOFTWARE\eWordNet\e3.0\eWNHome
Base directory for WordNet. Default is
\fBC:\eProgram~Files\eWordNet\e3.0\fP.
.SH FILES
.TP 30
.B lib/libwn.a
WordNet library (Unix)
.TP 30
.B lib\ewn.lib
WordNet library (Windows)
.TP 30
.B include
header files for use with WordNet library
.SH SEE ALSO
.BR wnintro (1WN),
.BR binsrch (3WN),
.BR morph (3WN),
.BR wnsearch (3WN),
.BR wnutil (3WN),
.BR wnintro (5WN),
.BR wnintro (7WN).
.LP
Fellbaum, C. (1998), ed.
\fI"WordNet: An Electronic Lexical Database"\fP.
MIT Press, Cambridge, MA.
.SH BUGS
Please report bugs to \fBwordnet@princeton.edu\fP.


@ -1,54 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNINTRO 5WN "Dec 2006" "WordNet 3.0" "WordNet\(tm File Formats"
.SH NAME
wnintro \- introduction to descriptions of WordNet file formats
.SH SYNOPSIS
.LP
\fBcntlist\fP \- format of \fBcntlist\fP and \fBcntlist.rev\fP files
.LP
\fBlexnames\fP \- list of lexicographer file names and numbers
.LP
\fBprologdb\fP \- description of Prolog database files
.LP
\fBsenseidx\fP \- format of sense index file
.LP
\fBsensemap\fP \- mapping from senses in WordNet 2.1 to corresponding
3.0 senses
.LP
\fBwndb\fP \- format of WordNet database files
.LP
\fBwninput\fP \- format of WordNet lexicographer files
.SH DESCRIPTION
This section of the \fIWordNet Reference Manual\fP contains manual pages
that describe the formats of the various files included in different
WordNet 3.0 packages.
.SH NOMENCLATURE
All files are in ASCII. Fields are generally separated by one space,
unless otherwise noted, and each line is terminated with a newline
character. In the file format descriptions, terms in \fIitalics\fP
refer to field names. Characters or strings in \fBboldface\fP
represent an actual character or string as it appears in the file.
Items enclosed in italicized square brackets (\fI[~~]\fP) are optional.
Since several files contain fields that have the identical meaning,
field names are consistently defined. For example, several WordNet
files contain one or more \fIsynset_offset\fP fields. In each case,
the definition of \fIsynset_offset\fP is identical.
.SH SEE ALSO
.BR wnintro (1WN),
.BR wnintro (3WN),
.BR cntlist (5WN),
.BR lexnames (5WN),
.BR prologdb (5WN),
.BR senseidx (5WN),
.BR sensemap (5WN),
.BR wndb (5WN),
.BR wninput (5WN),
.BR wnintro (7WN),
.BR wngloss (7WN).
.LP
Fellbaum, C. (1998), ed.
\fI"WordNet: An Electronic Lexical Database"\fP.
MIT Press, Cambridge, MA.


@ -1,40 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNINTRO 7WN "Dec 2006" "WordNet 3.0" "Miscellaneous WordNet\(tm Topics"
.SH NAME
wnintro \- introduction to miscellaneous WordNet information
.SH SYNOPSIS
.LP
\fBmorphy\fP \- discussion of WordNet's morphological processing
.LP
\fBuniqbeg\fP \- unique beginners for noun hierarchies
.LP
\fBwngloss\fP \- glossary of terms used in WordNet
.LP
\fBwngroups\fP \- discussion of WordNet search code to group similar senses
.LP
\fBwnlicens\fP \- text of WordNet license agreement
.LP
\fBwnpkgs\fP \- information about WordNet packages and distribution
.LP
\fBwnstats\fP \- database statistics
.SH DESCRIPTION
This section of the \fIWordNet Reference Manual\fP contains manual pages
that describe various topics related to WordNet and the semantic
concordances, and a glossary of terms.
.SH SEE ALSO
.BR wnintro (1WN),
.BR wnintro (3WN),
.BR wnintro (5WN),
.BR morphy (7WN),
.BR uniqbeg (7WN),
.BR wngroups (7WN),
.BR wnlicens (7WN),
.BR wnpkgs (7WN),
.BR wnstats (7WN),
.BR wngloss (7WN).
.LP
Fellbaum, C. (1998), ed.
\fI"WordNet: An Electronic Lexical Database"\fP.
MIT Press, Cambridge, MA.


@ -1,37 +0,0 @@
'\" t
.\" $Id$
.TH WNLICENS 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
wnlicens \- text of WordNet license
.SH DESCRIPTION
WordNet Release 3.0
This software and database is being provided to you, the LICENSEE, by
Princeton University under the following license. By obtaining, using
and/or copying this software and database, you agree that you have
read, understood, and will comply with these terms and conditions:
Permission to use, copy, modify and distribute this software and
database and its documentation for any purpose and without fee or
royalty is hereby granted, provided that you agree to comply with
the following copyright notice and statements, including the disclaimer,
and that the same appear on ALL copies of the software, database and
documentation, including modifications that you make for internal
use or for distribution.
WordNet 3.0 Copyright 2006 by Princeton University. All rights reserved.
THIS SOFTWARE AND DATABASE IS PROVIDED "AS IS" AND PRINCETON
UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PRINCETON
UNIVERSITY MAKES NO REPRESENTATIONS OR WARRANTIES OF MERCHANTABILITY
OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE
OF THE LICENSED SOFTWARE, DATABASE OR DOCUMENTATION WILL NOT
INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR
OTHER RIGHTS.
The name of Princeton University or Princeton may not be used in
advertising or publicity pertaining to distribution of the software
and/or database. Title to copyright in this software, database and
any associated documentation shall at all times remain with
Princeton University and LICENSEE agrees to preserve same.


@ -1,77 +0,0 @@
'\" t
.\" $Id$
.tr ~
.TH WNPKGS 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
wnpkgs \- description of various WordNet system packages
.SH DESCRIPTION
WordNet 3.0 is distributed in several formats and in various packages.
All of the packages are available via anonymous FTP from
\fBftp.cogsci.princeton.edu\fP and from the WordNet Web
site at \fBhttp://wordnet.princeton.edu\fP.
.SS "Packages Available Via FTP and WWW"
The following WordNet packages can be downloaded using a web browser
from \fBftp://ftp.cogsci.princeton.edu/wordnet/3.0\fP, or
from the Web site noted above. Users can also FTP directly from
\fBftp.cogsci.princeton.edu\fP, directory \fBwordnet/3.0\fP.
.TS
center box ;
c | c | c | c
lt | l | l | lt.
\fBPackage\fP \fBFilename\fP \fBPlatform\fP \fBDescription\fP
_
.na
Database \fBWordNet-3.0.tar.gz\fP Unix/OS X T{
WordNet 3.0 database, interfaces, sense index, interface
and library source code, documentation.
T}
Database \fBWordNet-3.0.exe\fP Windows T{
WordNet 3.0 database, interfaces, sense index, interface
and library source code, documentation.
T}
Prolog Database \fBWNprolog-3.0.tar.gz\fP All T{
WordNet 3.0 database files in Prolog-readable format, documentation.
T}
Sense Map \fBWNsnsmap-3.0.tar.gz\fP All T{
Mapping of 2.1 to 3.0 senses, documentation.
T}
.TE
.SS "Database Package"
The database package is a complete installation for WordNet 3.0 users.
It includes the 3.0 database files, source code for the WordNet browsers and
library, and documentation. The other packages are not included \-
they must be downloaded and installed separately.
Note that with this version of WordNet for Unix platforms, only source
code is provided. Users should carefully read the README and INSTALL
files for detailed information on compiling WordNet and dependencies.
.SS Prolog Database Package
The WordNet 3.0 database files are available in this package in a
Prolog-readable format. Documentation describing the file format is
included. This package is only downloadable in compressed tar file
format, although once unpacked it can be used from Windows
systems since the files are in ASCII. Many Windows utilities, such as
WinZip, can deal with a
compressed tar file.
.SS Sense Map Package
To help users automatically convert 2.1 noun and verb senses to their
corresponding 3.0 senses, we provide sense mapping information in
this package. This package contains files to map polysemous and
monosemous words, and documentation that describes the format of these
files. As with the Prolog database, this package is only downloadable
in compressed tar format, but the files are also in ASCII.
.SH NOTES
The lexicographer files and
.BR grind (1WN)
program are not generally distributed.
All of the packages described above may not be available at the time
of release of the 3.0 database package.
.SH SEE ALSO
.BR wnintro (1WN),
.BR wnintro (3WN),
.BR wnintro (5WN),
.BR wnintro (7WN).


@ -1,343 +0,0 @@
'\" t
.\" $Id$
.TH WNSEARCH 3WN "Dec 2006" "WordNet 3.0" "WordNet\(tm Library Functions"
.SH NAME
findtheinfo, findtheinfo_ds, is_defined, in_wn, index_lookup, parse_index, getindex, read_synset, parse_synset, free_syns, free_synset, free_index, traceptrs_ds, do_trace
.SH SYNOPSIS
.LP
\fB#include "wn.h"\fP
.LP
\fBchar *findtheinfo(char *searchstr, int pos, int ptr_type, int sense_num);\fP
.LP
\fBSynsetPtr findtheinfo_ds(char *searchstr, int pos, int ptr_type, int sense_num );\fP
.LP
\fBunsigned int is_defined(char *searchstr, int pos);\fP
.LP
\fBunsigned int in_wn(char *searchstr, int pos);\fP
.LP
\fBIndexPtr index_lookup(char *searchstr, int pos);\fP
.LP
\fBIndexPtr parse_index(long offset, int dbase, char *line);\fP
.LP
\fBIndexPtr getindex(char *searchstr, int pos);\fP
.LP
\fBSynsetPtr read_synset(int pos, long synset_offset, char *searchstr);\fP
.LP
\fBSynsetPtr parse_synset(FILE *fp, int pos, char *searchstr);\fP
.LP
\fBvoid free_syns(SynsetPtr synptr);\fP
.LP
\fBvoid free_synset(SynsetPtr synptr);\fP
.LP
\fBvoid free_index(IndexPtr idx);\fP
.LP
\fBSynsetPtr traceptrs_ds(SynsetPtr synptr, int ptr_type, int pos, int depth);\fP
.LP
\fBchar *do_trace(SynsetPtr synptr, int ptr_type, int pos, int depth);\fP
.SH DESCRIPTION
.LP
These functions are used for searching the WordNet database. They
generally fall into several categories: functions for reading and
parsing index file entries; functions for reading and parsing synsets
in data files; functions for tracing pointers and hierarchies;
functions for freeing space occupied by data structures allocated with
.BR malloc (3).
In the following function descriptions, \fIpos\fP is one of the
following:
.RS
.nf
\fB1\fP NOUN
\fB2\fP VERB
\fB3\fP ADJECTIVE
\fB4\fP ADVERB
.fi
.RE
.B findtheinfo(\|)
is the primary search algorithm for use with database interface
applications. Search results are automatically formatted, and a
pointer to the text buffer is returned. All searches listed in
.B WNHOME/include/wn.h
can be done by
.BR findtheinfo(\|) .
.B findtheinfo_ds(\|)
can be used to perform most of the searches, with results returned in
a linked list data structure. This is for use with applications that
need to analyze the search results rather than just display them.
Both functions are passed the same arguments: \fIsearchstr\fP is the
word or collocation to search for; \fIpos\fP indicates the syntactic
category to search in; \fIptr_type\fP is one of the valid search types
for \fIsearchstr\fP in \fIpos\fP. (Available searches can be obtained
by calling
.B is_defined(\|)
described below.) \fIsense_num\fP should be
.SB ALLSENSES
if the search is to be done on all senses of \fIsearchstr\fP in
\fIpos\fP, or a positive integer indicating which sense to search.
\fBfindtheinfo_ds(\|)\fP returns a linked list of data structures
representing synsets. Senses are linked through the \fInextss\fP
field of a \fBSynset\fP data structure. For each sense, synsets that
match the search specified with \fIptr_type\fP are linked through the
\fIptrlist\fP field. See
.SB "Synset Navigation",
below, for detailed information on the linked lists returned.
\fBis_defined(\|)\fP sets a bit for each search type that is valid for
\fIsearchstr\fP in \fIpos\fP, and returns the resulting unsigned
integer. Each bit number corresponds to a pointer type constant
defined in \fBWNHOME/include/wn.h\fP. For example, if bit 2 is
set, the
.SB HYPERPTR
search is valid for \fIsearchstr\fP. There are 29 possible searches.
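.LP
Testing the returned mask can be sketched in Python as follows. The
function names are hypothetical, and the sketch assumes that bit
\fIn\fP corresponds to the value 1 shifted left by \fIn\fP places.
Only the value of
.SB HYPERPTR
is taken from the text above; the other constants are defined in
\fBwn.h\fP.
.RS
.nf
```python
HYPERPTR = 2  # bit number quoted in the text above

def search_is_valid(mask, ptr_type):
    """Report whether the bit for ptr_type is set in mask."""
    return bool(mask & (1 << ptr_type))

def valid_searches(mask):
    """List the bit numbers set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]
```
.fi
.RE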
\fBin_wn(\|)\fP is used to find the syntactic categories in the
WordNet database that contain one or more senses of \fIsearchstr\fP.
If \fIpos\fP is
.SB ALL_POS,
all syntactic categories are checked. Otherwise, only the part of
speech passed is checked. An unsigned integer is returned with a bit
set corresponding to each syntactic category containing
\fIsearchstr\fP. The bit number matches the number for the part of
speech. \fB0\fP is returned if \fIsearchstr\fP is not present in
\fIpos\fP.
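.LP
Turning the value returned by \fBin_wn(\|)\fP into a list of syntactic
categories can be sketched in Python as follows. The names
\fBPOS_NAMES\fP and \fBcategories_from_mask\fP are hypothetical, and
the sketch again assumes that bit \fIn\fP is the value 1 shifted left
by \fIn\fP places.
.RS
.nf
```python
# Part of speech numbers as listed for pos earlier in this page.
POS_NAMES = {1: "noun", 2: "verb", 3: "adjective", 4: "adverb"}

def categories_from_mask(mask):
    """Name each syntactic category whose bit is set in mask."""
    return [name for pos, name in POS_NAMES.items() if mask & (1 << pos)]
```
.fi
.RE
An empty list corresponds to a return value of \fB0\fP.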
\fBindex_lookup(\|)\fP finds \fIsearchstr\fP in the index file for
\fIpos\fP and returns a pointer to the parsed entry in an \fBIndex\fP
data structure. \fIsearchstr\fP must exactly match the form of the
word (lower case only, hyphens and underscores in the same places) in
the index file.
.SB NULL
is returned if a match is not found.
\fBparse_index(\|)\fP parses an entry from an index file and returns a
pointer to the parsed entry in an \fBIndex\fP data structure.
Passed the byte \fIoffset\fP and syntactic category, it reads the index
entry at the desired location in the corresponding file. If passed
\fIline\fP, \fIline\fP contains an index file entry and the database
index file is not consulted. However, \fIoffset\fP and \fIdbase\fP
should still be passed so the information can be stored in the
\fBIndex\fP structure.
\fBgetindex(\|)\fP is a "smart" search for \fIsearchstr\fP in the
index file corresponding to \fIpos\fP. It applies to \fIsearchstr\fP
an algorithm that replaces underscores with hyphens, hyphens with
underscores, removes hyphens and underscores, and removes periods in
an attempt to find a form of the string that is an exact match for an
entry in the index file corresponding to \fIpos\fP.
\fBindex_lookup(\|)\fP is called on each transformed string until a
match is found or all the different strings have been tried. It
returns a pointer to the parsed \fBIndex\fP data structure for
\fIsearchstr\fP, or
.SB NULL
if a match is not found.
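.LP
The string transformations described above can be sketched in Python
as follows. The names are hypothetical, \fIindex\fP is a plain set of
strings standing in for \fBindex_lookup(\|)\fP, and the order in which
the library tries the forms is an assumption.
.RS
.nf
```python
def candidate_forms(searchstr):
    """List the distinct spellings tried for searchstr."""
    forms = [
        searchstr,
        searchstr.replace("_", "-"),
        searchstr.replace("-", "_"),
        searchstr.replace("-", "").replace("_", ""),
        searchstr.replace(".", ""),
    ]
    unique = []
    for form in forms:
        if form not in unique:
            unique.append(form)
    return unique

def smart_lookup(searchstr, index):
    """Return the first candidate found in index, or None."""
    for form in candidate_forms(searchstr):
        if form in index:
            return form
    return None
```
.fi
.RE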
\fBread_synset(\|)\fP is used to read a synset from a byte offset in a
data file. It performs an \fBfseek\fP(3) to \fIsynset_offset\fP in
the data file corresponding to \fIpos\fP, and calls
\fBparse_synset(\|)\fP to read and parse the synset. A pointer to the
\fBSynset\fP data structure containing the parsed synset is returned.
\fBparse_synset(\|)\fP reads the synset at the current offset in the
file indicated by \fIfp\fP. \fIpos\fP is the syntactic category, and
\fIsearchstr\fP, if not
.SB NULL,
indicates the word in the synset that the caller is interested in. An
attempt is made to match \fIsearchstr\fP to one of the words in the
synset. If an exact match is found, the \fIwhichword\fP field in the
\fBSynset\fP structure is set to that word's number in the synset
(beginning to count from \fB1\fP).
\fBfree_syns(\|)\fP is used to free a linked list of \fBSynset\fP
structures allocated by \fBfindtheinfo_ds(\|)\fP. \fIsynptr\fP is a
pointer to the list to free.
\fBfree_synset(\|)\fP frees the \fBSynset\fP structure pointed to by
\fIsynptr\fP.
\fBfree_index(\|)\fP frees the \fBIndex\fP structure pointed to by
\fIidx\fP.
\fBtraceptrs_ds(\|)\fP is a recursive search algorithm that traces
pointers matching \fIptr_type\fP starting with the synset pointed to
by \fIsynptr\fP. Setting \fIdepth\fP to \fB1\fP when
\fBtraceptrs_ds(\|)\fP is called indicates a recursive search; \fB0\fP
indicates a non-recursive call. \fIsynptr\fP points to the data
structure representing the synset to search for a pointer of type
\fIptr_type\fP. When a pointer type match is found, the synset
pointed to is read and linked onto the \fInextss\fP chain. Levels of
the tree generated by a recursive search are linked via the
\fIptrlist\fP field structure until
.SB NULL
is found, indicating the top (or bottom) of the tree. This function
is usually called from \fBfindtheinfo_ds(\|)\fP for each sense of the
word. See
.SB "Synset Navigation",
below, for detailed information on the linked lists returned.
\fBdo_trace(\|)\fP performs the search indicated by \fIptr_type\fP on
synset \fIsynptr\fP in syntactic category \fIpos\fP. \fIdepth\fP is
defined as above. \fBdo_trace(\|)\fP returns the search results
formatted in a text buffer.
.SS Synset Navigation
Since the \fBSynset\fP structure is used to represent the synsets for
both word senses and pointers, the \fIptrlist\fP and \fInextss\fP
fields have different meanings depending on whether the structure is a
word sense or pointer. This can make navigation through the lists
returned by \fBfindtheinfo_ds(\|)\fP confusing.
Navigation through the returned list involves the following:
Following the \fInextss\fP chain from the synset returned moves
through the various senses of \fIsearchstr\fP.
.SB NULL
indicates the end of the chain of senses.
Following the \fIptrlist\fP chain from a \fBSynset\fP structure
representing a sense traces the hierarchy of the search results for
that sense. Subsequent links in the \fIptrlist\fP chain indicate the
next level (up or down, depending on the search) in the hierarchy.
.SB NULL
indicates the end of the chain of search result synsets.
If a synset pointed to by \fIptrlist\fP has a value in the
\fInextss\fP field, it represents another pointer of the same type at
that level in the hierarchy. For example, some noun synsets have two
hypernyms. Following this \fInextss\fP pointer, and then the
\fIptrlist\fP chain from the \fBSynset\fP structure pointed to, traces
another, parallel, hierarchy, until the end is indicated by
.SB NULL
on that \fIptrlist\fP chain. So, a \fBSynset\fP representing a
pointer (versus a sense of \fIsearchstr\fP) having a non-NULL
value in \fInextss\fP has another chain of search results linked
through the \fIptrlist\fP chain of the synset pointed to by
\fInextss\fP.
If \fIsearchstr\fP contains more than one base form in WordNet (as in
the noun \fBaxes\fP, which has base forms \fBaxe\fP and \fBaxis\fP),
synsets representing the search results for each base form are linked
through the \fInextform\fP pointer of the \fBSynset\fP structure.
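The navigation rules above can be sketched with a stripped-down
structure. \fBstruct toy_synset\fP is a hypothetical stand-in that
keeps only the two link fields discussed here; the real \fBSynset\fP
structure has many more fields. The functions \fBcount_senses\fP,
\fBcount_chain\fP and \fBcount_results\fP are likewise illustrative
names only.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for the Synset structure. */
struct toy_synset {
    const char *label;
    struct toy_synset *nextss;   /* next sense, or next pointer of the
                                    same type at this level */
    struct toy_synset *ptrlist;  /* next level of the search results */
};

/* Count the senses: follow nextss from the synset a search returned. */
int count_senses(const struct toy_synset *first)
{
    int n = 0;

    for (; first != NULL; first = first->nextss)
        n++;
    return n;
}

/* Count pointer synsets starting at ptr: each synset on the nextss
 * chain counts once, plus everything linked below it via ptrlist. */
int count_chain(const struct toy_synset *ptr)
{
    int n = 0;

    for (; ptr != NULL; ptr = ptr->nextss)
        n += 1 + count_chain(ptr->ptrlist);
    return n;
}

/* Count all search-result synsets hanging off one sense, including
 * parallel hierarchies reached through nextss on a pointer synset. */
int count_results(const struct toy_synset *sense)
{
    return sense == NULL ? 0 : count_chain(sense->ptrlist);
}
```

Note that \fInextss\fP is followed at the top level to move between
senses, and below the top level to move between parallel hierarchies;
the code is the same, only the interpretation differs.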
.SS WordNet Searches
There is no extensive description of what each search type is or the
results returned. Using the WordNet interface, examining the source
code, and reading
.BR wndb (5WN)
are the best ways to see what types of searches are available and the
data returned for each.
Listed below are the valid searches
that can be passed as \fIptr_type\fP
to \fBfindtheinfo(\|)\fP. Passing a negative value (when applicable)
causes a recursive, hierarchical search by setting \fIdepth\fP to
\fB1\fP when \fBtraceptrs(\|)\fP is called.
.bp
.TS
center box ;
l | c | c | l
l | c | c | l
l | c | c | l .
\fBptr_type\fP \fBValue\fP \fBPointer\fP \fBSearch\fP
\fBSymbol\fP
_
ANTPTR 1 ! Antonyms
HYPERPTR 2 @ Hypernyms
HYPOPTR 3 \(ap Hyponyms
ENTAILPTR 4 * Entailment
SIMPTR 5 & Similar
ISMEMBERPTR 6 #m Member meronym
ISSTUFFPTR 7 #s Substance meronym
ISPARTPTR 8 #p Part meronym
HASMEMBERPTR 9 %m Member holonym
HASSTUFFPTR 10 %s Substance holonym
HASPARTPTR 11 %p Part holonym
MERONYM 12 % All meronyms
HOLONYM 13 # All holonyms
CAUSETO 14 > Cause
PPLPTR 15 < Participle of verb
SEEALSOPTR 16 ^ Also see
PERTPTR 17 \e Pertains to noun or derived from adjective
ATTRIBUTE 18 \\= Attribute
VERBGROUP 19 $ Verb group
DERIVATION 20 + Derivationally related form
CLASSIFICATION 21 ; Domain of synset
CLASS 22 - Member of this domain
SYNS 23 \fIn/a\fP Find synonyms
FREQ 24 \fIn/a\fP Polysemy
FRAMES 25 \fIn/a\fP Verb example sentences and generic frames
COORDS 26 \fIn/a\fP Noun coordinates
RELATIVES 27 \fIn/a\fP Group related senses
HMERONYM 28 \fIn/a\fP Hierarchical meronym search
HHOLONYM 29 \fIn/a\fP Hierarchical holonym search
WNGREP 30 \fIn/a\fP Find keywords by substring
OVERVIEW 31 \fIn/a\fP Show all synsets for word
CLASSIF_CATEGORY 32 ;c Show domain topic
CLASSIF_USAGE 33 ;u Show domain usage
CLASSIF_REGIONAL 34 ;r Show domain region
CLASS_CATEGORY 35 -c Show domain terms for topic
CLASS_USAGE 36 -u Show domain terms for usage
CLASS_REGIONAL 37 -r Show domain terms for region
INSTANCE 38 @i Instance of
INSTANCES 39 \(api Show instances
.TE
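The sign convention for \fIptr_type\fP can be sketched as follows.
\fBdecode_search\fP is a hypothetical helper written for illustration;
it only shows how a negative value might be split into a pointer type
and a \fIdepth\fP flag, and is not taken from the library source.

```c
#include <stdlib.h>

/* Split a possibly negative search code into the pointer type
 * (always positive) and a depth flag: 1 = recursive, 0 = not. */
void decode_search(int search, int *ptr_type, int *depth)
{
    *depth = (search < 0) ? 1 : 0;
    *ptr_type = abs(search);
}
```

For example, passing \fB-2\fP (the negated value of HYPERPTR in the
table above) yields pointer type \fB2\fP with \fIdepth\fP \fB1\fP,
while passing \fB2\fP yields the same pointer type with \fIdepth\fP
\fB0\fP.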
\fBfindtheinfo_ds(\|)\fP cannot perform the following searches:
.RS
.nf
SEEALSOPTR
PERTPTR
VERBGROUP
FREQ
FRAMES
RELATIVES
WNGREP
OVERVIEW
.fi
.RE
.SH NOTES
Applications that use WordNet and/or the morphological functions
must call \fBwninit(\|)\fP at the start of the program. See
.BR wnutil (3WN)
for more information.
In all function calls, \fIsearchstr\fP may be either a word or a
collocation formed by joining individual words with underscore
characters (\fB_\fP).
The \fBSearchResults\fP structure defines fields in the
\fIwnresults\fP global variable that are set by the various search
functions. This is a way to get additional information, such as the
number of senses the word has, from the search functions.
The \fIsearchds\fP field is set by \fBfindtheinfo_ds(\|)\fP.
The \fIpos\fP passed to \fBtraceptrs_ds(\|)\fP is not used.
.SH SEE ALSO
.BR wn (1WN),
.BR wnb (1WN),
.BR wnintro (3WN),
.BR binsrch (3WN),
.BR malloc (3),
.BR morph (3WN),
.BR wnutil (3WN),
.BR wnintro (5WN).
.SH WARNINGS
\fBparse_synset(\|)\fP must find an exact match between the
\fIsearchstr\fP passed and a word in the synset to set
\fIwhichword\fP. No attempt is made to translate hyphens and
underscores, as is done in \fBgetindex(\|)\fP.
The WordNet database and exception list files must be opened with
\fBwninit(\|)\fP prior to using any of the searching functions.
A large search may cause \fBfindtheinfo(\|)\fP to run out of buffer
space. The maximum buffer size is determined by the computer platform.
If the buffer size is exceeded the following message is printed in the
output buffer: \fB"Search too large. Narrow search and try
again..."\fP.
Passing an invalid \fIpos\fP will probably result in a core dump.

@ -1,65 +0,0 @@
'\" t
.\" $Id$
.TH WNSTATS 7WN "Dec 2006" "WordNet 3.0" "WordNet\(tm"
.SH NAME
wnstats \- WordNet 3.0 database statistics
.SH DESCRIPTION
.SS Number of words, synsets, and senses
.TS
center box tab(/);
c | c | c | c
c | c | c | c
l | r | r | r.
\fBPOS\fP/\fBUnique\fP/\fBSynsets\fP/\fBTotal\fP
/\fBStrings\fP//\fBWord-Sense Pairs\fP
_
Noun/117798/82115/146312
Verb/11529/13767/25047
Adjective/21479/18156/30002
Adverb/4481/3621/5580
=
Totals/155287/117659/206941
.TE
.SS Polysemy information
.TS
center box tab(/);
c | c | c | c
c | c | c | c
l | r | r | r.
\fBPOS\fP/\fBMonosemous\fP/\fBPolysemous\fP/\fBPolysemous\fP
/\fBWords and Senses\fP/\fBWords\fP/\fBSenses\fP
_
Noun/101863/15935/44449
Verb/6277/5252/18770
Adjective/16503/4976/14399
Adverb/3748/733/1832
=
Totals/128391/26896/79450
.TE
.TS
center box tab(/);
c | c | c
c | c | c
l | r | r.
\fBPOS\fP/\fBAverage Polysemy\fP/\fBAverage Polysemy\fP
/\fBIncluding Monosemous Words\fP/\fBExcluding Monosemous Words\fP
_
Noun/1.24/2.79
Verb/2.17/3.57
Adjective/1.40/2.71
Adverb/1.25/2.50
.TE
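The noun averages are consistent with a simple ratio of the counts in
the earlier tables: total word-sense pairs divided by unique strings
when monosemous words are included, and polysemous senses divided by
polysemous words when they are excluded. This reading is an inference
from the noun figures, not something the page states, and
\fBaverage_polysemy\fP is a hypothetical helper.

```c
/* Average polysemy: number of senses divided by number of words.
 * Returns 0.0 when there are no words, to avoid dividing by zero. */
double average_polysemy(long senses, long words)
{
    return words > 0 ? (double)senses / (double)words : 0.0;
}
```

For nouns, 146312 word-sense pairs over 117798 unique strings rounds
to 1.24, and 44449 polysemous senses over 15935 polysemous words rounds
to 2.79, matching the noun row.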
.SH NOTES
Statistics for all types of adjectives and adjective satellites are
combined.
The total of all unique noun, verb, adjective, and adverb strings is
actually 147278.
However, many strings are unique within a
syntactic category, but are in more than one syntactic category. The
figures in the table represent the unique strings in each syntactic category.

@ -1,177 +0,0 @@
'\" t
.\" $Id$
.TH WNUTIL 3WN "Dec 2006" "WordNet 3.0" "WordNet\(tm Library Functions"
.SH NAME
wninit, re_wninit, cntwords, strtolower, ToLowerCase, strsubst,
getptrtype, getpos, getsstype, StrToPos, GetSynsetForSense,
GetDataOffset, GetPolyCount, WNSnsToStr,
GetValidIndexPointer, GetWNSense, GetSenseIndex, GetTagcnt, default_display_message
.SH SYNOPSIS
.LP
\fB#include "wn.h"\fP
.LP
\fBint wninit(void);\fP
.LP
\fBint re_wninit(void);\fP
.LP
\fBint cntwords(char *str, char separator);\fP
.LP
\fBchar *strtolower(char *str);\fP
.LP
\fBchar *ToLowerCase(char *str);\fP
.LP
\fBchar *strsubst(char *str, char from, char to);\fP
.LP
\fBint getptrtype(char *ptr_symbol);\fP
.LP
\fBint getpos(char *ss_type);\fP
.LP
\fBint getsstype(char *ss_type);\fP
.LP
\fBint StrToPos(char *pos);\fP
.LP
\fBSynsetPtr GetSynsetForSense(char *sense_key);\fP
.LP
\fBlong GetDataOffset(char *sense_key);\fP
.LP
\fBint GetPolyCount(char *sense_key);\fP
.LP
\fBchar *WNSnsToStr(IndexPtr idx, int sense_num);\fP
.LP
\fBIndexPtr GetValidIndexPointer(char *str, int pos);\fP
.LP
\fBint GetWNSense(char *lemma, char *lex_sense);\fP
.LP
\fBSnsIndexPtr GetSenseIndex(char *sense_key);\fP
.LP
\fBint GetTagcnt(IndexPtr idx, int sense);\fP
.LP
\fBint default_display_message(char *msg);\fP
.SH DESCRIPTION
.LP
The WordNet library contains many utility functions used by the
interface code, other library functions, and various applications and
tools. Only those of importance to the WordNet search code, or which
are generally useful, are described here.
.B wninit(\|)
opens the files necessary for using WordNet with the WordNet library
functions. The database files are opened, and
.B morphinit(\|)
is called to open the exception list files. Returns \fB0\fP if
successful, \fB-1\fP otherwise. The database and exception list files
must be open before the WordNet search and morphology functions are
used. If the database is successfully opened, the global variable
\fBOpenDB\fP is set to \fB1\fP. Note that it is possible for the
database files to be opened (\fBOpenDB == 1\fP), but not the exception
list files.
.B re_wninit(\|)
is used to close the database files and reopen them, and is used
exclusively for WordNet development.
.B re_morphinit(\|)
is called to close and reopen the exception list files. Return codes
are as described above.
.B cntwords(\|)
counts the number of underscore or space separated words in \fIstr\fP.
A hyphen is passed in \fIseparator\fP if it is to be considered a
word delimiter. Otherwise \fIseparator\fP can be any other
character, or an underscore if another character is not desired.
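The counting rule can be sketched as below. \fBcount_words\fP is a
hypothetical function written for this page, not the library's
\fBcntwords(\|)\fP; in particular, its handling of empty strings and of
leading or trailing separators is a choice made for this sketch and
may differ from the library.

```c
/* Count words in str.  Underscores, spaces and the caller-supplied
 * separator all end a word; a run of them counts as one break. */
int count_words(const char *str, char separator)
{
    int count = 0;
    int in_word = 0;

    for (; *str != '\0'; str++) {
        if (*str == '_' || *str == ' ' || *str == separator) {
            in_word = 0;          /* delimiter: the current word ends */
        } else if (!in_word) {
            in_word = 1;          /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

Passing a hyphen as the separator makes "mother-in-law" count as three
words; passing an underscore leaves it as one.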
.B strtolower(\|)
converts \fIstr\fP to lower case and removes a trailing adjective
marker, if present. \fIstr\fP is actually modified by this function,
and a pointer to the modified string is returned.
.B ToLowerCase(\|)
converts \fIstr\fP to lower case as above, without removing an
adjective marker.
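The difference between the two conversions can be sketched as follows,
assuming the adjective marker is a trailing parenthesized code such as
"(p)" as described in
.BR wninput (5WN).
The names \fBlower_strip_marker\fP and \fBlower_only\fP are
hypothetical. Both functions modify the string in place.

```c
#include <ctype.h>

/* Lower-case str in place and cut it at the first '(' so that a
 * trailing marker such as "(p)" is removed.  Returns str. */
char *lower_strip_marker(char *str)
{
    char *s;

    for (s = str; *s != '\0'; s++) {
        if (*s == '(') {
            *s = '\0';            /* drop the marker and stop */
            break;
        }
        *s = (char)tolower((unsigned char)*s);
    }
    return str;
}

/* Lower-case str in place, leaving any marker intact.  Returns str. */
char *lower_only(char *str)
{
    char *s;

    for (s = str; *s != '\0'; s++)
        *s = (char)tolower((unsigned char)*s);
    return str;
}
```

Because the input is modified, callers that need the original string
must pass a copy, and must never pass a string literal.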
.B strsubst(\|)
replaces all occurrences of \fIfrom\fP with \fIto\fP in \fIstr\fP and
returns the resulting string.
.B getptrtype(\|)
returns the integer \fIptr_type\fP corresponding to the pointer
character passed in \fIptr_symbol\fP. See
.BR wnsearch (3WN)
for a table of pointer symbols and types.
.B getpos(\|)
returns the integer constant corresponding to the synset type passed.
\fIss_type\fP may be one of the following: \fBn, v, a, r, s\fP. If
\fBs\fP is passed,
.SB ADJ
is returned. Exits with \fB-1\fP if \fIss_type\fP is invalid.
.B getsstype(\|)
works like \fBgetpos(\|)\fP, but returns
.SB SATELLITE
if \fIss_type\fP is \fBs\fP.
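The relationship between the two functions can be sketched as follows.
The \fBTOY_\fP constants and their numeric values are assumptions made
for this illustration (the real constants are defined in
\fBwn.h\fP), and the \fBtoy_\fP functions return \fB-1\fP on invalid
input instead of exiting.

```c
/* Hypothetical constants standing in for those defined in wn.h. */
enum { TOY_NOUN = 1, TOY_VERB, TOY_ADJ, TOY_ADV, TOY_SATELLITE };

/* Map a synset type string to a constant; "s" stays a satellite. */
int toy_getsstype(const char *ss_type)
{
    switch (ss_type[0]) {
    case 'n': return TOY_NOUN;
    case 'v': return TOY_VERB;
    case 'a': return TOY_ADJ;
    case 'r': return TOY_ADV;
    case 's': return TOY_SATELLITE;
    default:  return -1;          /* invalid synset type */
    }
}

/* Same mapping, but an adjective satellite is folded into ADJ. */
int toy_getpos(const char *ss_type)
{
    int type = toy_getsstype(ss_type);

    return type == TOY_SATELLITE ? TOY_ADJ : type;
}
```

The two functions differ only for \fBs\fP: one reports the satellite
type, the other reports the adjective category it belongs to.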
.B StrToPos(\|)
returns the integer constant corresponding to the syntactic category
passed in \fIpos\fP. \fIpos\fP must be one of the following:
\fBnoun, verb, adj, adv\fP. \fB-1\fP is returned if \fIpos\fP is
invalid.
.B GetSynsetForSense(\|)
returns the synset that contains the word sense \fIsense_key\fP and
.SB NULL
in case of error.
.B GetDataOffset(\|)
returns the synset offset for the synset that contains the word sense
\fIsense_key\fP, and \fB0\fP if \fIsense_key\fP is not in sense index
file.
.B GetPolyCount(\|)
returns the polysemy count (number of senses in WordNet) for
\fIlemma\fP encoded in \fIsense_key\fP, and \fB0\fP if the word is not
found.
.B WNSnsToStr(\|)
returns sense key encoding for \fIsense_num\fP entry in \fIidx\fP.
.B GetValidIndexPointer(\|)
returns the Index structure for \fIstr\fP in \fIpos\fP. Calls
.BR morphstr (3WN)
to find a valid base form if \fIstr\fP is inflected.
.B GetWNSense(\|)
returns the WordNet sense number for the sense key encoding
represented by \fIlemma\fP and \fIlex_sense\fP.
.B GetSenseIndex(\|)
returns parsed sense index entry for \fIsense_key\fP and
.SB NULL
if \fIsense_key\fP is not in sense index.
.B GetTagcnt(\|)
returns the number of times the sense passed has been tagged according
to the \fIcntlist\fP file.
.B default_display_message(\|)
simply returns \fB-1\fP. This is the default value for the global
variable \fBdisplay_message\fP, which points to a function to call to
display an error message. In general, applications (including the
WordNet interfaces) define an application specific function and set
\fBdisplay_message\fP to point to it.
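The callback pattern described above can be sketched in isolation.
All names here (\fBmessage_hook\fP, \fBquiet_message\fP,
\fBstderr_message\fP, \fBset_message_hook\fP, \fBreport\fP) are
hypothetical; the sketch does not touch the library's
\fBdisplay_message\fP variable and only mirrors the idea of a function
pointer with a do-nothing default.

```c
#include <stdio.h>

/* Default handler: ignore the message and return -1. */
int quiet_message(char *msg)
{
    (void)msg;
    return -1;
}

/* Application-specific handler: write the message to stderr. */
int stderr_message(char *msg)
{
    fputs(msg, stderr);
    return 0;
}

/* The hook starts out pointing at the do-nothing default. */
static int (*message_hook)(char *) = quiet_message;

/* Install a handler; NULL restores the default. */
void set_message_hook(int (*fn)(char *))
{
    message_hook = (fn != NULL) ? fn : quiet_message;
}

/* Code that needs to report an error calls through the hook. */
int report(char *msg)
{
    return message_hook(msg);
}
```

Routing every message through one pointer lets a command-line tool
print to the terminal and a graphical interface show a dialog without
the reporting code knowing which is in use.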
.SH NOTES
\fBinclude/wn.h\fP lists all the pointer and search
types and their corresponding constant values. There is no
description of what each search type is or the results returned.
Using the WordNet interface is the best way to see what types of
searches are available, and the data returned for each.
.SH SEE ALSO
.BR wnintro (3WN),
.BR wnsearch (3WN),
.BR morph (3WN),
.BR wnintro (5WN),
.BR wnintro (7WN).
.SH WARNINGS
Error checking on passed arguments is not rigorous. Passing
.SB NULL
pointers or invalid values will often cause an application to die.

@ -1,313 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# doc/pdf/Makefile. Generated from Makefile.in by configure.
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
srcdir = .
top_srcdir = ../..
pkgdatadir = $(datadir)/WordNet
pkglibdir = $(libdir)/WordNet
pkgincludedir = $(includedir)/WordNet
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = /usr/csl/bin/install -c
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/pdf
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
$(srcdir)/*) f=`echo "$$p" | sed "s|^$$srcdirstrip/||"`;; \
*) f=$$p;; \
esac;
am__strip_dir = `echo $$p | sed -e 's|^.*/||'`;
am__installdirs = "$(DESTDIR)$(pdfdir)"
pdfDATA_INSTALL = $(INSTALL_DATA)
DATA = $(pdf_DATA)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run aclocal-1.9
AMDEP_FALSE = #
AMDEP_TRUE =
AMTAR = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run tar
AUTOCONF = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoconf
AUTOHEADER = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run autoheader
AUTOMAKE = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run automake-1.9
AWK = nawk
CC = gcc
CCDEPMODE = depmode=gcc3
CFLAGS = -g -O2
CPP = gcc -E
CPPFLAGS =
CYGPATH_W = echo
DEFS = -DHAVE_CONFIG_H
DEPDIR = .deps
ECHO_C =
ECHO_N = -n
ECHO_T =
EGREP = egrep
EXEEXT =
INSTALL_DATA = ${INSTALL} -m 644
INSTALL_PROGRAM = ${INSTALL}
INSTALL_SCRIPT = ${INSTALL}
INSTALL_STRIP_PROGRAM = ${SHELL} $(install_sh) -c -s
LDFLAGS =
LIBOBJS =
LIBS =
LTLIBOBJS =
MAKEINFO = ${SHELL} /people/wn/src/Release/3.0/Unix/missing --run makeinfo
OBJEXT = o
PACKAGE = WordNet
PACKAGE_BUGREPORT = wordnet@princeton.edu
PACKAGE_NAME = WordNet
PACKAGE_STRING = WordNet 3.0
PACKAGE_TARNAME = wordnet
PACKAGE_VERSION = 3.0
PATH_SEPARATOR = :
RANLIB = ranlib
SET_MAKE =
SHELL = /bin/bash
STRIP =
TCL_INCLUDE_SPEC = -I/usr/csl/include
TCL_LIB_SPEC = -L/usr/csl/lib -ltcl8.4
TK_LIBS = -L/usr/openwin/lib -lX11 -ldl -lpthread -lsocket -lnsl -lm
TK_LIB_SPEC = -L/usr/csl/lib -ltk8.4
TK_PREFIX = /usr/csl
TK_XINCLUDES = -I/usr/openwin/include
VERSION = 3.0
ac_ct_CC = gcc
ac_ct_RANLIB = ranlib
ac_ct_STRIP =
ac_prefix = /usr/local/WordNet-3.0
am__fastdepCC_FALSE = #
am__fastdepCC_TRUE =
am__include = include
am__leading_dot = .
am__quote =
am__tar = ${AMTAR} chof - "$$tardir"
am__untar = ${AMTAR} xf -
bindir = ${exec_prefix}/bin
build_alias =
datadir = ${prefix}/share
exec_prefix = ${prefix}
host_alias =
includedir = ${prefix}/include
infodir = ${prefix}/info
install_sh = /people/wn/src/Release/3.0/Unix/install-sh
libdir = ${exec_prefix}/lib
libexecdir = ${exec_prefix}/libexec
localstatedir = ${prefix}/var
mandir = ${prefix}/man
mkdir_p = $(install_sh) -d
oldincludedir = /usr/include
prefix = /usr/local/WordNet-3.0
program_transform_name = s,x,x,
sbindir = ${exec_prefix}/sbin
sharedstatedir = ${prefix}/com
sysconfdir = ${prefix}/etc
target_alias =
pdfdir = $(prefix)/doc/pdf
pdf_DATA = binsrch.3.pdf cntlist.5.pdf grind.1.pdf lexnames.5.pdf morph.3.pdf morphy.7.pdf senseidx.5.pdf uniqbeg.7.pdf wn.1.pdf wnb.1.pdf wndb.5.pdf wngloss.7.pdf wngroups.7.pdf wninput.5.pdf wnintro.1.pdf wnintro.3.pdf wnintro.5.pdf wnintro.7.pdf wnlicens.7.pdf wnpkgs.7.pdf wnsearch.3.pdf wnstats.7.pdf wnutil.3.pdf
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/pdf/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/pdf/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-pdfDATA: $(pdf_DATA)
@$(NORMAL_INSTALL)
test -z "$(pdfdir)" || $(mkdir_p) "$(DESTDIR)$(pdfdir)"
@list='$(pdf_DATA)'; for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
f=$(am__strip_dir) \
echo " $(pdfDATA_INSTALL) '$$d$$p' '$(DESTDIR)$(pdfdir)/$$f'"; \
$(pdfDATA_INSTALL) "$$d$$p" "$(DESTDIR)$(pdfdir)/$$f"; \
done
uninstall-pdfDATA:
@$(NORMAL_UNINSTALL)
@list='$(pdf_DATA)'; for p in $$list; do \
f=$(am__strip_dir) \
echo " rm -f '$(DESTDIR)$(pdfdir)/$$f'"; \
rm -f "$(DESTDIR)$(pdfdir)/$$f"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(DATA)
installdirs:
for dir in "$(DESTDIR)$(pdfdir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-pdfDATA
install-exec-am:
install-info: install-info-am
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-info-am uninstall-pdfDATA
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-pdfDATA install-strip installcheck installcheck-am \
installdirs maintainer-clean maintainer-clean-generic \
mostlyclean mostlyclean-generic pdf pdf-am ps ps-am uninstall \
uninstall-am uninstall-info-am uninstall-pdfDATA
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

@ -1,2 +0,0 @@
pdfdir=$(prefix)/doc/pdf
pdf_DATA =binsrch.3.pdf cntlist.5.pdf grind.1.pdf lexnames.5.pdf morph.3.pdf morphy.7.pdf senseidx.5.pdf uniqbeg.7.pdf wn.1.pdf wnb.1.pdf wndb.5.pdf wngloss.7.pdf wngroups.7.pdf wninput.5.pdf wnintro.1.pdf wnintro.3.pdf wnintro.5.pdf wnintro.7.pdf wnlicens.7.pdf wnpkgs.7.pdf wnsearch.3.pdf wnstats.7.pdf wnutil.3.pdf

@ -1,313 +0,0 @@
# Makefile.in generated by automake 1.9 from Makefile.am.
# @configure_input@
# Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
# 2003, 2004 Free Software Foundation, Inc.
# This Makefile.in is free software; the Free Software Foundation
# gives unlimited permission to copy and/or distribute it,
# with or without modifications, as long as this notice is preserved.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY, to the extent permitted by law; without
# even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.
@SET_MAKE@
srcdir = @srcdir@
top_srcdir = @top_srcdir@
VPATH = @srcdir@
pkgdatadir = $(datadir)/@PACKAGE@
pkglibdir = $(libdir)/@PACKAGE@
pkgincludedir = $(includedir)/@PACKAGE@
top_builddir = ../..
am__cd = CDPATH="$${ZSH_VERSION+.}$(PATH_SEPARATOR)" && cd
INSTALL = @INSTALL@
install_sh_DATA = $(install_sh) -c -m 644
install_sh_PROGRAM = $(install_sh) -c
install_sh_SCRIPT = $(install_sh) -c
INSTALL_HEADER = $(INSTALL_DATA)
transform = $(program_transform_name)
NORMAL_INSTALL = :
PRE_INSTALL = :
POST_INSTALL = :
NORMAL_UNINSTALL = :
PRE_UNINSTALL = :
POST_UNINSTALL = :
subdir = doc/pdf
DIST_COMMON = $(srcdir)/Makefile.am $(srcdir)/Makefile.in
ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
$(top_srcdir)/configure.ac
am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
mkinstalldirs = $(install_sh) -d
CONFIG_HEADER = $(top_builddir)/config.h
CONFIG_CLEAN_FILES =
SOURCES =
DIST_SOURCES =
am__vpath_adj_setup = srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`;
am__vpath_adj = case $$p in \
$(srcdir)/*) f=`echo "$$p" | sed "s|^$$srcdirstrip/||"`;; \
*) f=$$p;; \
esac;
am__strip_dir = `echo $$p | sed -e 's|^.*/||'`;
am__installdirs = "$(DESTDIR)$(pdfdir)"
pdfDATA_INSTALL = $(INSTALL_DATA)
DATA = $(pdf_DATA)
DISTFILES = $(DIST_COMMON) $(DIST_SOURCES) $(TEXINFOS) $(EXTRA_DIST)
ACLOCAL = @ACLOCAL@
AMDEP_FALSE = @AMDEP_FALSE@
AMDEP_TRUE = @AMDEP_TRUE@
AMTAR = @AMTAR@
AUTOCONF = @AUTOCONF@
AUTOHEADER = @AUTOHEADER@
AUTOMAKE = @AUTOMAKE@
AWK = @AWK@
CC = @CC@
CCDEPMODE = @CCDEPMODE@
CFLAGS = @CFLAGS@
CPP = @CPP@
CPPFLAGS = @CPPFLAGS@
CYGPATH_W = @CYGPATH_W@
DEFS = @DEFS@
DEPDIR = @DEPDIR@
ECHO_C = @ECHO_C@
ECHO_N = @ECHO_N@
ECHO_T = @ECHO_T@
EGREP = @EGREP@
EXEEXT = @EXEEXT@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_SCRIPT = @INSTALL_SCRIPT@
INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
LDFLAGS = @LDFLAGS@
LIBOBJS = @LIBOBJS@
LIBS = @LIBS@
LTLIBOBJS = @LTLIBOBJS@
MAKEINFO = @MAKEINFO@
OBJEXT = @OBJEXT@
PACKAGE = @PACKAGE@
PACKAGE_BUGREPORT = @PACKAGE_BUGREPORT@
PACKAGE_NAME = @PACKAGE_NAME@
PACKAGE_STRING = @PACKAGE_STRING@
PACKAGE_TARNAME = @PACKAGE_TARNAME@
PACKAGE_VERSION = @PACKAGE_VERSION@
PATH_SEPARATOR = @PATH_SEPARATOR@
RANLIB = @RANLIB@
SET_MAKE = @SET_MAKE@
SHELL = @SHELL@
STRIP = @STRIP@
TCL_INCLUDE_SPEC = @TCL_INCLUDE_SPEC@
TCL_LIB_SPEC = @TCL_LIB_SPEC@
TK_LIBS = @TK_LIBS@
TK_LIB_SPEC = @TK_LIB_SPEC@
TK_PREFIX = @TK_PREFIX@
TK_XINCLUDES = @TK_XINCLUDES@
VERSION = @VERSION@
ac_ct_CC = @ac_ct_CC@
ac_ct_RANLIB = @ac_ct_RANLIB@
ac_ct_STRIP = @ac_ct_STRIP@
ac_prefix = @ac_prefix@
am__fastdepCC_FALSE = @am__fastdepCC_FALSE@
am__fastdepCC_TRUE = @am__fastdepCC_TRUE@
am__include = @am__include@
am__leading_dot = @am__leading_dot@
am__quote = @am__quote@
am__tar = @am__tar@
am__untar = @am__untar@
bindir = @bindir@
build_alias = @build_alias@
datadir = @datadir@
exec_prefix = @exec_prefix@
host_alias = @host_alias@
includedir = @includedir@
infodir = @infodir@
install_sh = @install_sh@
libdir = @libdir@
libexecdir = @libexecdir@
localstatedir = @localstatedir@
mandir = @mandir@
mkdir_p = @mkdir_p@
oldincludedir = @oldincludedir@
prefix = @prefix@
program_transform_name = @program_transform_name@
sbindir = @sbindir@
sharedstatedir = @sharedstatedir@
sysconfdir = @sysconfdir@
target_alias = @target_alias@
pdfdir = $(prefix)/doc/pdf
pdf_DATA = binsrch.3.pdf cntlist.5.pdf grind.1.pdf lexnames.5.pdf morph.3.pdf morphy.7.pdf senseidx.5.pdf uniqbeg.7.pdf wn.1.pdf wnb.1.pdf wndb.5.pdf wngloss.7.pdf wngroups.7.pdf wninput.5.pdf wnintro.1.pdf wnintro.3.pdf wnintro.5.pdf wnintro.7.pdf wnlicens.7.pdf wnpkgs.7.pdf wnsearch.3.pdf wnstats.7.pdf wnutil.3.pdf
all: all-am
.SUFFIXES:
$(srcdir)/Makefile.in: $(srcdir)/Makefile.am $(am__configure_deps)
@for dep in $?; do \
case '$(am__configure_deps)' in \
*$$dep*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh \
&& exit 0; \
exit 1;; \
esac; \
done; \
echo ' cd $(top_srcdir) && $(AUTOMAKE) --gnu doc/pdf/Makefile'; \
cd $(top_srcdir) && \
$(AUTOMAKE) --gnu doc/pdf/Makefile
.PRECIOUS: Makefile
Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status
@case '$?' in \
*config.status*) \
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \
*) \
echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \
cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \
esac;
$(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(top_srcdir)/configure: $(am__configure_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
$(ACLOCAL_M4): $(am__aclocal_m4_deps)
cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh
uninstall-info-am:
install-pdfDATA: $(pdf_DATA)
@$(NORMAL_INSTALL)
test -z "$(pdfdir)" || $(mkdir_p) "$(DESTDIR)$(pdfdir)"
@list='$(pdf_DATA)'; for p in $$list; do \
if test -f "$$p"; then d=; else d="$(srcdir)/"; fi; \
f=$(am__strip_dir) \
echo " $(pdfDATA_INSTALL) '$$d$$p' '$(DESTDIR)$(pdfdir)/$$f'"; \
$(pdfDATA_INSTALL) "$$d$$p" "$(DESTDIR)$(pdfdir)/$$f"; \
done
uninstall-pdfDATA:
@$(NORMAL_UNINSTALL)
@list='$(pdf_DATA)'; for p in $$list; do \
f=$(am__strip_dir) \
echo " rm -f '$(DESTDIR)$(pdfdir)/$$f'"; \
rm -f "$(DESTDIR)$(pdfdir)/$$f"; \
done
tags: TAGS
TAGS:
ctags: CTAGS
CTAGS:
distdir: $(DISTFILES)
@srcdirstrip=`echo "$(srcdir)" | sed 's|.|.|g'`; \
topsrcdirstrip=`echo "$(top_srcdir)" | sed 's|.|.|g'`; \
list='$(DISTFILES)'; for file in $$list; do \
case $$file in \
$(srcdir)/*) file=`echo "$$file" | sed "s|^$$srcdirstrip/||"`;; \
$(top_srcdir)/*) file=`echo "$$file" | sed "s|^$$topsrcdirstrip/|$(top_builddir)/|"`;; \
esac; \
if test -f $$file || test -d $$file; then d=.; else d=$(srcdir); fi; \
dir=`echo "$$file" | sed -e 's,/[^/]*$$,,'`; \
if test "$$dir" != "$$file" && test "$$dir" != "."; then \
dir="/$$dir"; \
$(mkdir_p) "$(distdir)$$dir"; \
else \
dir=''; \
fi; \
if test -d $$d/$$file; then \
if test -d $(srcdir)/$$file && test $$d != $(srcdir); then \
cp -pR $(srcdir)/$$file $(distdir)$$dir || exit 1; \
fi; \
cp -pR $$d/$$file $(distdir)$$dir || exit 1; \
else \
test -f $(distdir)/$$file \
|| cp -p $$d/$$file $(distdir)/$$file \
|| exit 1; \
fi; \
done
check-am: all-am
check: check-am
all-am: Makefile $(DATA)
installdirs:
for dir in "$(DESTDIR)$(pdfdir)"; do \
test -z "$$dir" || $(mkdir_p) "$$dir"; \
done
install: install-am
install-exec: install-exec-am
install-data: install-data-am
uninstall: uninstall-am
install-am: all-am
@$(MAKE) $(AM_MAKEFLAGS) install-exec-am install-data-am
installcheck: installcheck-am
install-strip:
$(MAKE) $(AM_MAKEFLAGS) INSTALL_PROGRAM="$(INSTALL_STRIP_PROGRAM)" \
install_sh_PROGRAM="$(INSTALL_STRIP_PROGRAM)" INSTALL_STRIP_FLAG=-s \
`test -z '$(STRIP)' || \
echo "INSTALL_PROGRAM_ENV=STRIPPROG='$(STRIP)'"` install
mostlyclean-generic:
clean-generic:
distclean-generic:
-test -z "$(CONFIG_CLEAN_FILES)" || rm -f $(CONFIG_CLEAN_FILES)
maintainer-clean-generic:
@echo "This command is intended for maintainers to use"
@echo "it deletes files that may require special tools to rebuild."
clean: clean-am
clean-am: clean-generic mostlyclean-am
distclean: distclean-am
-rm -f Makefile
distclean-am: clean-am distclean-generic
dvi: dvi-am
dvi-am:
html: html-am
info: info-am
info-am:
install-data-am: install-pdfDATA
install-exec-am:
install-info: install-info-am
install-man:
installcheck-am:
maintainer-clean: maintainer-clean-am
-rm -f Makefile
maintainer-clean-am: distclean-am maintainer-clean-generic
mostlyclean: mostlyclean-am
mostlyclean-am: mostlyclean-generic
pdf: pdf-am
pdf-am:
ps: ps-am
ps-am:
uninstall-am: uninstall-info-am uninstall-pdfDATA
.PHONY: all all-am check check-am clean clean-generic distclean \
distclean-generic distdir dvi dvi-am html html-am info info-am \
install install-am install-data install-data-am install-exec \
install-exec-am install-info install-info-am install-man \
install-pdfDATA install-strip installcheck installcheck-am \
installdirs maintainer-clean maintainer-clean-generic \
mostlyclean mostlyclean-generic pdf pdf-am ps ps-am uninstall \
uninstall-am uninstall-info-am uninstall-pdfDATA
# Tell versions [3.59,3.63) of GNU make to not export all variables.
# Otherwise a system limit (for SysV at least) may be exceeded.
.NOEXPORT:

Binary file not shown.

Binary file not shown.

Some files were not shown because too many files have changed in this diff.