55 commits
53aa47c
Install diamond/blastp
eweizy Mar 26, 2025
55b7632
Unfinished diamond/blastp integration
eweizy Mar 26, 2025
3c5b661
Add diamond
eweizy Mar 26, 2025
b1fdc74
reinstalled diamond/blastp module. Installed blast/makeblastdb
tracelail May 21, 2025
9f1ea67
wrote draft integration of BLAST_MAKEBLASTDB and NCBIREFSEQDOWNLOAD i…
tracelail Jun 2, 2025
52506f7
installed diamond makedb
tracelail Jun 3, 2025
49ee17c
cleared nf-test logs and added tuple output for main.nf.test of diam…
tracelail Jun 3, 2025
6ab82f3
Merge branch 'dev' of https://github.com/tracelail/proteinannotator i…
tracelail Jun 3, 2025
89fb03e
Merge branch 'dev' of https://github.com/nf-core/proteinannotator int…
tracelail Jun 3, 2025
f12c619
removed blast/makeblastdb nf-core module and create local diamondprep…
tracelail Jun 10, 2025
4f4db82
finished up writing the ncbirefseqdownload process for the first draf…
tracelail Jun 17, 2025
3946ba5
edited ncbirefseqdownload script to a working state where the nf-tes…
tracelail Jun 18, 2025
94ebe1b
created a working ncbirefseqdownload module with basic nf-test. Also …
tracelail Jun 26, 2025
06bf5e8
added more working tests to ncbirefseqdownload and organized.
tracelail Jun 27, 2025
7681f82
Added diamondpreparetaxa main.nf script and a process.success nf-test.
tracelail Jun 30, 2025
0cf27f7
Merge branch 'dev' into unfinished-diamond-blastp
tracelail Jul 1, 2025
92c847c
added working snapshot assertions for process.out match and versions …
tracelail Jul 2, 2025
043451f
Added output documentation for all seven Diamond subworkflow outputs.
tracelail Jul 7, 2025
f59c0b7
wrote a potential subworkflow for diamond as well as a test. Copied t…
tracelail Jul 9, 2025
cf259d3
added stub portion to ncbirefseqdownload. Added all output emits to d…
tracelail Jul 9, 2025
9fd5d3f
Added stub section to diamondpreparetaxa module.
tracelail Jul 9, 2025
fa390ce
created a simple flow diagram of the diamond subworkflow and its mod…
tracelail Jul 9, 2025
4eb5ad1
Apply suggestions from code review
tracelail Jul 28, 2025
71ff9ef
Created workflow success tests for diamond subworkflow. Added nextflo…
tracelail Jul 29, 2025
a4f00be
Merge branch 'dev' of https://github.com/nf-core/proteinannotator int…
tracelail Jul 29, 2025
16e10b2
Merge remote-tracking branch 'refs/remotes/origin/unfinished-diamond-…
tracelail Jul 29, 2025
c970041
corrected typo in diamondpreparetaxa container
tracelail Jul 30, 2025
925ed89
updated diamond/makedb module
tracelail Jul 30, 2025
7b8a6e1
removed params.diamond_blast_columns = 'qseqid' to resolve testing co…
tracelail Jul 31, 2025
1f5ba7a
updated nf-core module diamond/blastp to match diamond/makedb
tracelail Jul 31, 2025
8edc405
working subworkflow nf-test with large prot.accession2taxid.gz taxonmap.
tracelail Aug 6, 2025
a3e661c
Updated diamond subworkflow main.nf.test with a smaller taxonmap for …
tracelail Aug 7, 2025
3fbcc6a
created a local diamond subworkflow that produces a diamond/blastp ou…
tracelail Aug 21, 2025
a4911fc
Added example outputs for DIAMOND subworkflow. Added default paramete…
tracelail Aug 26, 2025
64ed6dc
Added usage documentation for DIAMOND subworkflow.
tracelail Aug 26, 2025
556b3e3
Updated nextflow_schema and readme.
tracelail Aug 26, 2025
f5b63c2
minimal label edit to functional annotation workflow.
tracelail Aug 26, 2025
6cd3a20
updated nextflow_schema and nextflow.config
tracelail Aug 26, 2025
3ffbc1a
changed diamond_blast_columns values to null for string inputs
tracelail Oct 2, 2025
0b1df66
added meta.yml info and some tags for functional annotation subworkflow
tracelail Oct 2, 2025
2257719
made stub updates to diamondpreparetaxa and ncbirefseqdownload and ad…
tracelail Oct 2, 2025
7632328
Added diamond_blast_columns = "" back as being null caused issues. …
tracelail Oct 3, 2025
c1f0c63
deleted interproscan functional annotation subworkflow tests to remov…
tracelail Oct 3, 2025
b58e6f2
manually resolved functional_annotation test merge preparation
tracelail Mar 30, 2026
2041cfc
Merge branch 'dev' into unfinished-diamond-blastp
tracelail Mar 30, 2026
4dcba98
updated missed merge conflict
tracelail Mar 31, 2026
6a39985
minor updates for cleaning and future version implementation in local…
tracelail Apr 3, 2026
8dfc3ea
Updated modules.json to remove mmseqs/search that would break. Can be…
tracelail Apr 3, 2026
0c9ba7d
Addressed some linting issues in modules and schema json
tracelail Apr 6, 2026
60e36fa
Updated schema and config to fix lint error. Updated readme with diam…
tracelail Apr 6, 2026
a2e2c02
diamond subworkflow needed a default with elvis operator and the diam…
tracelail Apr 9, 2026
ac1a9c8
updated diamond nf-core modules
tracelail Apr 9, 2026
9a7dace
Diamond modules update had new versioning. Removed old version emit c…
tracelail Apr 9, 2026
67617ad
updated resolution conflicts
tracelail Apr 9, 2026
63facff
Merge branch 'dev' into unfinished-diamond-blastp
tracelail Apr 9, 2026
2 changes: 1 addition & 1 deletion .gitignore
@@ -12,4 +12,4 @@ null/
.nf-test.log
.nf-test/tests
.nf-test-*.nf
-.nf-test/*
+.nf-test/*
8 changes: 8 additions & 0 deletions .nf-test.log
@@ -0,0 +1,8 @@
Apr-09 11:11:33.550 [main] INFO com.askimed.nf.test.App - nf-test 0.9.4
Apr-09 11:11:33.572 [main] INFO com.askimed.nf.test.App - Arguments: [test, subworkflows/local/domain_annotation, --profile, test,docker, --update-snapshot, --tag, stub]
Apr-09 11:11:34.489 [main] INFO com.askimed.nf.test.App - Nextflow Version: 25.10.4
Apr-09 11:11:34.491 [main] INFO com.askimed.nf.test.commands.RunTestsCommand - Load config from file /home/trace/projects/proteinannotator/nf-test.config...
Apr-09 11:11:35.532 [main] INFO com.askimed.nf.test.lang.dependencies.DependencyResolver - Loaded 37 files from directory /home/trace/projects/proteinannotator in 0.188 sec
Apr-09 11:11:35.535 [main] INFO com.askimed.nf.test.lang.dependencies.DependencyResolver - Found 1 files containing tests.
Apr-09 11:11:35.535 [main] DEBUG com.askimed.nf.test.lang.dependencies.DependencyResolver - Found files: [/home/trace/projects/proteinannotator/subworkflows/local/domain_annotation/tests/main.nf.test]
Apr-09 11:11:35.960 [main] INFO com.askimed.nf.test.commands.RunTestsCommand - Found 0 tests to execute.
5 changes: 4 additions & 1 deletion .vscode/settings.json
@@ -1,3 +1,6 @@
{
-"markdown.styles": ["public/vscode_markdown.css"]
+"markdown.styles": [
+"public/vscode_markdown.css"
+],
+"nextflow.telemetry.enabled": true
}
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Added`

- [[PR #52](https://github.com/nf-core/proteinannotator/pull/52)] Add option to turn off InterProScan for testing
- [[PR #51](https://github.com/nf-core/proteinannotator/pull/51)] Update to nf-core/tools v3.3.1
- [[PR #50](https://github.com/nf-core/proteinannotator/pull/50)] Add DIAMOND subworkflow to run [Diamond](https://github.com/bbuchfink/diamond)
- [[PR #47](https://github.com/nf-core/proteinannotator/pull/47)] Update metromap with more tools added from [May 2025 Hackathon](https://nf-co.re/events/2025/hackathon-boston)
- [[PR #43](https://github.com/nf-core/proteinannotator/pull/44)] Add [mTM-Align](https://nf-co.re/modules/mtmalign_align/) and [MMseqs2 Search](https://nf-co.re/modules/mmseqs_search/) modules
- [[PR #42](https://github.com/nf-core/proteinannotator/pull/42)] Updated to `nf-test` on GitHub Actions and in the `PULL_REQUEST_TEMPLATE.md`
- [[PR #13](https://github.com/nf-core/proteinannotator/pull/13)] Add nf-core seqkit/stats module
- [[PR #9](https://github.com/nf-core/proteinannotator/pull/9)] Add [InterProScan](https://interproscan-docs.readthedocs.io/) module
- [#90](https://github.com/nf-core/proteinannotator/pull/90) - Added the option to download and use the latest `metagRoot` HMM library (or use path to an existing one) for domain annotation. (by @angelphanth)
- [#90](https://github.com/nf-core/proteinannotator/pull/90) - Added the option to download and use the latest `metagRoot` HMM library (or use path to an existing one) for domain annotation. (by @angelphanth)
- [#87](https://github.com/nf-core/proteinannotator/pull/87) - Added the option to download and use the latest `NMPFams` HMM library (or use path to an existing one) for domain annotation. (by @npechl)
- [#85](https://github.com/nf-core/proteinannotator/pull/85) - Added zenodo doi in `nextflow.config`. (by @vagkaratzas)
5 changes: 5 additions & 0 deletions CITATIONS.md
@@ -10,6 +10,11 @@

## Pipeline tools


- [DIAMOND](https://github.com/bbuchfink/diamond)

> Buchfink B, Xie C, Huson DH, "Fast and sensitive protein alignment using DIAMOND", Nature Methods 12, 59-60 (2015). doi:10.1038/nmeth.3176

- [SeqFu](https://pubmed.ncbi.nlm.nih.gov/34066939/)

> Telatin A, Fariselli P, Birolo G. SeqFu: a suite of utilities for the robust and reproducible manipulation of sequence files. Bioengineering. 2021 May 7;8(5):59. doi: 10.3390/bioengineering8050059. PubMed PMID: 34066939; PubMed Central PMCID: PMC8148589.
9 changes: 9 additions & 0 deletions README.md
@@ -23,6 +23,14 @@

**nf-core/proteinannotator** is a bioinformatics pipeline that computes statistics for protein FASTA inputs and produces protein annotations based on predicted sequence features, including conserved domains, functions, and secondary structure.

1. Run ([`seqkit stats`](https://bioinf.shenwei.me/seqkit/usage/#stats)) to summarize input protein FASTA files
2. Functional Annotation:
   1. ([`InterProScan`](https://interproscan-docs.readthedocs.io/en/v5/)) a software tool used to analyze protein sequences by scanning them against the signatures of protein families, domains, and sites in the [InterPro](https://www.ebi.ac.uk/interpro/) database, helping to identify their functional characteristics.
   2. ([`DIAMOND`](https://github.com/bbuchfink/diamond)) a tool for sensitive protein sequence alignment that compares input sequences against a reference database built from combined protein FASTA files and taxonomic information (taxon names, nodes, and map).
3. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
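
The DIAMOND step in the list above combines a protein FASTA with three taxonomy inputs before searching. The sketch below shows how the two underlying commands might be assembled. The `diamond makedb` and `diamond blastp` options are DIAMOND's own command-line flags; the helper names `makedb_command` and `blastp_command` and every file name are hypothetical and are not taken from this PR's modules. The commands are only printed, not run.

```python
import shlex
from typing import List, Sequence


def makedb_command(fasta: str, db_prefix: str, taxonmap: str,
                   taxonnodes: str, taxonnames: str) -> List[str]:
    """Build the argv for `diamond makedb` with the three taxonomy inputs."""
    return [
        "diamond", "makedb",
        "--in", fasta,                # combined reference protein FASTA
        "--db", db_prefix,            # output database prefix
        "--taxonmap", taxonmap,       # accession-to-taxid map
        "--taxonnodes", taxonnodes,   # taxonomy nodes file
        "--taxonnames", taxonnames,   # taxonomy names file
    ]


def blastp_command(db_prefix: str, query: str, out: str,
                   columns: Sequence[str] = ()) -> List[str]:
    """Build the argv for `diamond blastp` with tabular (format 6) output."""
    cmd = ["diamond", "blastp",
           "--db", db_prefix,
           "--query", query,
           "--out", out,
           "--outfmt", "6"]
    cmd.extend(columns)  # optional field names follow the format code
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(makedb_command(
        "refseq.faa", "refseq_db",
        "prot.accession2taxid.gz", "nodes.dmp", "names.dmp")))
    print(shlex.join(blastp_command(
        "refseq_db", "proteins.faa", "hits.tsv",
        ["qseqid", "sseqid", "evalue", "bitscore", "staxids"])))
```

Building the argument lists separately from running them keeps the sketch free of side effects; in the pipeline itself the equivalent calls would live inside the Nextflow modules.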


<h1>
<p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="docs/images/proteinannotator_metromap_dark.png">
@@ -40,6 +48,7 @@ Generate input amino acid sequence statistics with ([`SeqFu`](https://github.com
such as [Pfam](https://ftp.ebi.ac.uk/pub/databases/Pfam/), [FunFam](https://download.cathdb.info/cath/releases/all-releases/), and [NMPFams and metagRoot](https://pavlopoulos-lab.org/envofams/databases/hmmer/)
2. Functional annotation:
- ([`InterProScan`](https://interproscan-docs.readthedocs.io/en/v5/)) a software tool used to analyze protein sequences by scanning them against the signatures of protein families, domains, and sites in the [InterPro](https://www.ebi.ac.uk/interpro/) database, helping to identify their functional characteristics.
- ([`DIAMOND`](https://github.com/bbuchfink/diamond)) a rapid and sensitive protein sequence aligner used to search input sequences against a reference database built from NCBI RefSeq protein sequences with taxonomic information, providing potential homologous protein matches across species.
3. Predict secondary structure compositional features such as α-helices, β-strands and coils with ([`s4pred`](https://github.com/psipred/s4pred))
4. Present QC stats for input sequences before and after initial pre-processing with ([`MultiQC`](http://multiqc.info/))
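
The DIAMOND search described in step 2 reports candidate homologs as tab-separated hits. As a rough illustration of how such a table could be reduced to one best match per input protein, the sketch below assumes the default 12-column BLAST tabular layout; the columns actually emitted by the pipeline depend on its `diamond_blast_columns` setting and may differ. The function name `best_hits` and the sample rows are made up for the example.

```python
import csv
import io
from typing import Dict, Iterable, Sequence

# Default field order of BLAST-style tabular output (format 6).
DEFAULT_COLUMNS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)


def best_hits(lines: Iterable[str],
              columns: Sequence[str] = DEFAULT_COLUMNS) -> Dict[str, dict]:
    """Return the highest-bitscore hit for each query sequence."""
    best: Dict[str, dict] = {}
    reader = csv.DictReader(lines, fieldnames=list(columns), delimiter="\t")
    for row in reader:
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        # Keep the first hit seen, or replace it with a stronger one.
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


# Made-up rows, for illustration only.
SAMPLE = "\n".join([
    "\t".join(["q1", "refA", "95.0", "100", "5", "0",
               "1", "100", "1", "100", "1e-50", "180.0"]),
    "\t".join(["q1", "refB", "80.0", "100", "20", "0",
               "1", "100", "1", "100", "1e-30", "120.0"]),
    "\t".join(["q2", "refC", "70.0", "90", "27", "0",
               "1", "90", "5", "94", "1e-20", "99.5"]),
])

if __name__ == "__main__":
    for query, hit in best_hits(io.StringIO(SAMPLE)).items():
        print(query, hit["sseqid"], hit["bitscore"])
```

Ranking by bit score is only one possible choice; ranking by e-value or percent identity would need a different comparison.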
