Add support for parsing publiccode.yml as package metadata #4865
Open: kumarasantosh wants to merge 7 commits into aboutcode-org:develop from kumarasantosh:feature/publiccode-yml-handler
Changes from 5 commits
c77ae2f packagedcode: fix gemspec version constants being stored as-is (kumarasantosh)
f8d1c47 packagedcode: add gemspec version constant coverage (kumarasantosh)
09a4197 git push origin fix/ibpp-license-detection-issue-3553 --force (kumarasantosh)
2eae344 licenses: add IBPP License v1.1 detection [fixes #3553] (kumarasantosh)
ce55547 Add publiccode.yml package handler\n\nImplements a new DatafileHandle… (kumarasantosh)
98a1fc0 packagedcode: remove unrelated changes from publiccode PR (kumarasantosh)
1229323 packagedcode: fix publiccode license extraction (kumarasantosh)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| --- | ||
| license_expression: ibpp | ||
| is_license_intro: yes | ||
| relevance: 85 | ||
| ignorable_copyrights: | ||
| - (c) Copyright 2000-2006 T.I.P. Group S.A. and the IBPP Team (www.ibpp.org) | ||
| ignorable_holders: | ||
| - T.I.P. Group S.A. and the IBPP Team | ||
| ignorable_urls: | ||
| - http://www.ibpp.org/ | ||
| --- | ||
| IBPP License v1.1 | ||
| ----------------- | ||
| | ||
| (C) Copyright 2000-2006 T.I.P. Group S.A. and the IBPP Team (www.ibpp.org) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| license_expression: ibpp | ||
| is_license_reference: yes | ||
| relevance: 90 | ||
| --- | ||
| IBPP License, see appendix |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,133 @@ | ||
| # | ||
| # Copyright (c) nexB Inc. and others. All rights reserved. | ||
| # ScanCode is a trademark of nexB Inc. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # See http://www.apache.org/licenses/LICENSE-2.0 for the license text. | ||
| # See https://github.com/nexB/scancode-toolkit for support or download. | ||
| # See https://aboutcode.org for more information about nexB OSS projects. | ||
| # | ||
| | ||
| import logging | ||
| import os | ||
| | ||
| import saneyaml | ||
| | ||
| from packagedcode import models | ||
| | ||
| """ | ||
| Handle publiccode.yml metadata files. | ||
| publiccode.yml is a metadata standard for public sector open source software. | ||
| See https://github.com/publiccodeyml/publiccode.yml | ||
| """ | ||
| | ||
| TRACE = os.environ.get('SCANCODE_DEBUG_PACKAGE', False) | ||
| | ||
| logger = logging.getLogger(__name__) | ||
| | ||
| | ||
| class PubliccodeYmlHandler(models.DatafileHandler): | ||
|     datasource_id = 'publiccode_yml' | ||
|     path_patterns = ('*/publiccode.yml', '*/publiccode.yaml') | ||
|     default_package_type = 'publiccode' | ||
|     default_primary_language = None | ||
|     description = 'publiccode.yml metadata file' | ||
|     documentation_url = 'https://github.com/publiccodeyml/publiccode.yml' | ||
| | ||
|     @classmethod | ||
|     def parse(cls, location, package_only=False): | ||
|         with open(location, 'rb') as f: | ||
|             data = saneyaml.load(f.read()) | ||
| | ||
|         if not data or not isinstance(data, dict): | ||
|             return | ||
| | ||
|         # Validate: a publiccode.yml must have 'publiccodeYmlVersion' | ||
|         if 'publiccodeYmlVersion' not in data: | ||
|             return | ||
| | ||
|         name = data.get('name') | ||
|         version = data.get('softwareVersion') | ||
|         vcs_url = data.get('url') | ||
|         homepage_url = data.get('landingURL') or vcs_url | ||
| | ||
|         # License is under legal.license (SPDX expression) | ||
|         legal = data.get('legal') or {} | ||
|         declared_license = legal.get('license') | ||
|         copyright_statement = legal.get('mainCopyrightOwner') or legal.get('repoOwner') | ||
| | ||
|         # Description: prefer English, fall back to first available language | ||
|         description = _get_description(data) | ||
| | ||
|         # Keywords from categories | ||
|         categories = data.get('categories') or [] | ||
|         keywords = ', '.join(categories) if categories else None | ||
| | ||
|         # Parties from maintenance.contacts | ||
|         parties = [] | ||
|         maintenance = data.get('maintenance') or {} | ||
|         for contact in maintenance.get('contacts') or []: | ||
|             contact_name = contact.get('name') | ||
|             contact_email = contact.get('email') | ||
|             if contact_name or contact_email: | ||
|                 parties.append( | ||
|                     models.Party( | ||
|                         type=models.party_person, | ||
|                         name=contact_name, | ||
|                         email=contact_email, | ||
|                         role='maintainer', | ||
|                     ) | ||
|                 ) | ||
| | ||
|         # Extra data | ||
|         extra_data = {} | ||
|         schema_version = data.get('publiccodeYmlVersion') | ||
|         if schema_version: | ||
|             extra_data['publiccodeYmlVersion'] = schema_version | ||
|         platforms = data.get('platforms') | ||
|         if platforms: | ||
|             extra_data['platforms'] = platforms | ||
|         development_status = data.get('developmentStatus') | ||
|         if development_status: | ||
|             extra_data['developmentStatus'] = development_status | ||
|         software_type = data.get('softwareType') | ||
|         if software_type: | ||
|             extra_data['softwareType'] = software_type | ||
| | ||
|         yield models.PackageData( | ||
|             datasource_id=cls.datasource_id, | ||
|             type=cls.default_package_type, | ||
|             name=name, | ||
|             version=version, | ||
|             vcs_url=vcs_url, | ||
|             homepage_url=homepage_url, | ||
|             description=description, | ||
|             declared_license_expression=declared_license, | ||
|             copyright=copyright_statement, | ||
|             keywords=keywords, | ||
|             parties=parties, | ||
|             extra_data=extra_data or None, | ||
|         ) | ||
| | ||
| | ||
| def _get_description(data): | ||
|     """ | ||
|     Extract the best available description from publiccode.yml's | ||
|     multilingual 'description' block. Prefer English, fall back to | ||
|     any available language. Returns longDescription, else shortDescription. | ||
|     """ | ||
|     description_block = data.get('description') or {} | ||
|     if not description_block: | ||
|         return | ||
| | ||
|     lang_data = ( | ||
|         description_block.get('en') | ||
|         or description_block.get('eng') | ||
|         or next(iter(description_block.values()), None) | ||
|     ) | ||
|     if not lang_data: | ||
|         return | ||
| | ||
|     long_desc = lang_data.get('longDescription', '').strip() | ||
|     short_desc = lang_data.get('shortDescription', '').strip() | ||
| | ||
|     return long_desc or short_desc or None |
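The language fallback in `_get_description` above can be tried in isolation. The following is a minimal standalone sketch, not code from this PR: `pick_description` is a hypothetical name, and the `or ''` guards are an added assumption to cover keys that are present but set to null in the YAML, a case the handler as shown would not accept.

```python
def pick_description(description_block):
    """
    Hypothetical standalone variant of the description lookup.
    Return the long description if present, else the short one, else None.
    """
    if not isinstance(description_block, dict) or not description_block:
        return None

    # Prefer English, then fall back to the first language listed.
    lang_data = (
        description_block.get('en')
        or description_block.get('eng')
        or next(iter(description_block.values()), None)
    )
    if not isinstance(lang_data, dict):
        return None

    # `or ''` guards against keys that exist but hold None (YAML null).
    long_desc = (lang_data.get('longDescription') or '').strip()
    short_desc = (lang_data.get('shortDescription') or '').strip()
    return long_desc or short_desc or None


# Only an Italian block is available, so it is used as the fallback.
print(pick_description({'it': {'shortDescription': 'Descrizione breve'}}))
# English is preferred even when another language is listed first.
print(pick_description({
    'it': {'longDescription': 'Descrizione lunga'},
    'en': {'longDescription': 'Long description'},
}))
```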
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| IBPP License v1.1 | ||
| ----------------- | ||
| | ||
| (C) Copyright 2000-2006 T.I.P. Group S.A. and the IBPP Team (www.ibpp.org) | ||
| | ||
| Permission is hereby granted, free of charge, to any person or organization | ||
| ("You") obtaining a copy of this software and associated documentation files | ||
| covered by this license (the "Software") to use the Software as part of another | ||
| work; to modify it for that purpose; to publish or distribute it, modified or | ||
| not, for that same purpose; to permit persons to whom the other work using the | ||
| Software is furnished to do so; subject to the following conditions: the above | ||
| copyright notice and this complete and unmodified permission notice shall be | ||
| included in all copies or substantial portions of the Software; You will not | ||
| misrepresent modified versions of the Software as being the original. | ||
| | ||
| The Software is provided "as is", without warranty of any kind, express or | ||
| implied, including but not limited to the warranties of merchantability, | ||
| fitness for a particular purpose and noninfringement. In no event shall | ||
| the authors or copyright holders be liable for any claim, damages or other | ||
| liability, whether in an action of contract, tort or otherwise, arising from, | ||
| out of or in connection with the software or the use of other dealings in | ||
| the Software. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| license_expressions: | ||
| - ibpp |
Member (review comment on this file): Why are changes from other issues/PRs here?
tests/licensedcode/data/datadriven/lic1/wt_ibpp_interference.md: 26 changes (26 additions & 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| passwdqc | ||
| Copyright (c) 2000-2002 by Solar Designer | ||
| Copyright (c) 2008,2009 by Dmitry V. Levin | ||
| Redistribution and use in source and binary forms, with or without | ||
| modification, are permitted. | ||
| | ||
| THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND | ||
| ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
| IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | ||
| ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE | ||
| FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | ||
| DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | ||
| OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | ||
| HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | ||
| LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | ||
| OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | ||
| SUCH DAMAGE. | ||
| | ||
| | IBPP | Wt::Dbo Firebird backend | IBPP License, see appendix | Copyright 2000-2006 T.I.P. Group S.A. and the IBPP Team | | ||
| | ||
| ### IBPP | ||
| | ||
| IBPP License v1.1 | ||
| ----------------- | ||
| | ||
| (C) Copyright 2000-2006 T.I.P. Group S.A. and the IBPP Team (www.ibpp.org) |
tests/licensedcode/data/datadriven/lic1/wt_ibpp_interference.md.yml: 5 changes (5 additions & 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| notes: Minimal Wt-derived regression fixture for issue #3553, keeping the passwdqc disclaimer immediately before the IBPP reference and appendix intro. | ||
| license_expressions: | ||
| - bsd-1-clause | ||
| - ibpp | ||
| - ibpp |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| publiccodeYmlVersion: "0.4" | ||
| | ||
| name: Medusa | ||
| url: "https://example.com/italia/medusa.git" | ||
| landingURL: "https://example.com/medusa" | ||
| softwareVersion: "1.0.3" | ||
| | ||
| platforms: | ||
|   - web | ||
|   - linux | ||
| | ||
| categories: | ||
|   - financial-reporting | ||
|   - accounting | ||
| | ||
| developmentStatus: stable | ||
| softwareType: "standalone/desktop" | ||
| | ||
| description: | ||
|   en: | ||
|     shortDescription: > | ||
|       A short description of this software. | ||
|     longDescription: > | ||
|       A very long description of this software. It explains what it does, | ||
|       who it is for, and why you might want to use it in a public | ||
|       administration context. | ||
|     features: | ||
|       - Feature one | ||
|       - Feature two | ||
| | ||
| legal: | ||
|   license: AGPL-3.0-or-later | ||
|   mainCopyrightOwner: City of Example | ||
|   repoOwner: City of Example | ||
| | ||
| maintenance: | ||
|   type: "contract" | ||
|   contacts: | ||
|     - name: Francesco Rossi | ||
|       email: f.rossi@example.com | ||
|       affiliation: City of Example | ||
| | ||
| localisation: | ||
|   localisationReady: true | ||
|   availableLanguages: | ||
|     - en | ||
|     - it |
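To make the fixture easier to review against the handler, here is a rough sketch of the key mapping that `parse()` in this PR applies, run on a hand-written dict that mirrors part of the fixture. The dict stands in for the loaded YAML (an assumption, since no YAML loader is used here), and `expected_fields` is a hypothetical helper that is not part of the PR.

```python
# Hand-written stand-in for a subset of the fixture; a YAML loader is
# assumed to produce an equivalent mapping.
fixture = {
    'publiccodeYmlVersion': '0.4',
    'name': 'Medusa',
    'url': 'https://example.com/italia/medusa.git',
    'landingURL': 'https://example.com/medusa',
    'softwareVersion': '1.0.3',
    'platforms': ['web', 'linux'],
    'categories': ['financial-reporting', 'accounting'],
    'developmentStatus': 'stable',
    'softwareType': 'standalone/desktop',
    'legal': {
        'license': 'AGPL-3.0-or-later',
        'mainCopyrightOwner': 'City of Example',
        'repoOwner': 'City of Example',
    },
    'maintenance': {
        'type': 'contract',
        'contacts': [{
            'name': 'Francesco Rossi',
            'email': 'f.rossi@example.com',
            'affiliation': 'City of Example',
        }],
    },
}

EXTRA_KEYS = ('publiccodeYmlVersion', 'platforms', 'developmentStatus', 'softwareType')


def expected_fields(data):
    """Hypothetical helper: mirror the key mapping used by the handler."""
    legal = data.get('legal') or {}
    maintenance = data.get('maintenance') or {}
    contacts = maintenance.get('contacts') or []
    return {
        'name': data.get('name'),
        'version': data.get('softwareVersion'),
        'vcs_url': data.get('url'),
        # The landing page wins; the repository URL is the fallback.
        'homepage_url': data.get('landingURL') or data.get('url'),
        'keywords': ', '.join(data.get('categories') or []) or None,
        'copyright': legal.get('mainCopyrightOwner') or legal.get('repoOwner'),
        # Contacts with neither a name nor an email are skipped.
        'maintainers': [
            (c.get('name'), c.get('email'))
            for c in contacts
            if c.get('name') or c.get('email')
        ],
        'extra_data': {k: data[k] for k in EXTRA_KEYS if data.get(k)},
    }


for key, value in expected_fields(fixture).items():
    print(f'{key}: {value!r}')
```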
You need to populate the extracted license statement, on which we run license detection; you cannot populate the license expression directly.
Valid SPDX license expressions are supported and parsed automatically; see https://github.com/publiccodeyml/publiccode.yml/blob/main/docs/standard/schema.core.rst#key-legallicense
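One way to read the review comment about the extracted license statement is that the raw `legal.license` value should be handed over as the extracted statement, leaving the expression to be derived by license detection. The sketch below only builds the keyword arguments; `license_kwargs` is a hypothetical helper, and `extracted_license_statement` is assumed to be the `PackageData` field the reviewer is referring to.

```python
def license_kwargs(data):
    """
    Hypothetical helper: return the license-related keyword arguments
    to pass when building the package data for a publiccode.yml mapping.
    """
    legal = data.get('legal') or {}
    statement = legal.get('license')
    if not statement:
        return {}
    # Pass the raw text through unchanged. No `declared_license_expression`
    # is set here; that is assumed to be filled in by license detection.
    return {'extracted_license_statement': statement}


print(license_kwargs({'legal': {'license': 'AGPL-3.0-or-later'}}))
```

In the handler, the returned mapping would be merged into the other keyword arguments in place of the `declared_license_expression=declared_license` line.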