DBW – Databases and Web development. 2023-24 Exercises (Deadline 25th Feb)
Personal web site
A personal web site with free format and content, installed on the course server. It should include:
- Links to Solved exercises (below)
- A "project" section including a link to the presentations and a link to the running application
Web application to execute an external program (CLUSTAL-Omega)
- Prepare a web application (PHP or Python/Flask, running on the course server) to perform multiple sequence alignment using Clustal-Omega (the executable can be obtained from http://www.clustal.org).
It should have:
- Input options:
- A set of protein sequences (in FASTA)
- A set of Uniprot ids (sequences could be obtained from https://www.uniprot.org/uniprot/{id}.fasta)
- A file upload as an alternative input source
- Program options (minimum set):
- output format
- (Optional) other Clustal-O options
- Check the input for errors (e.g. unknown format, no sequences available, ...) and give meaningful messages
- Format the output (be aware of the possible output formats) and allow the results to be downloaded.
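As an illustration of the input checks listed above, here is a minimal Python sketch. The function names `parse_fasta` and `uniprot_fasta_url`, the accepted residue alphabet, and the rule that at least two sequences are required are all assumptions made for this example; only the Uniprot URL template comes from the exercise text.

```python
# Template quoted in the exercise text.
UNIPROT_URL = "https://www.uniprot.org/uniprot/{id}.fasta"

# Assumption: standard residues plus ambiguity codes, stop and gap symbols.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*-")


def uniprot_fasta_url(uniprot_id):
    """Build the download URL for one Uniprot id (no network access here)."""
    return UNIPROT_URL.format(id=uniprot_id.strip())


def parse_fasta(text):
    """Return a list of (header, sequence) pairs or raise ValueError."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("Unknown format: input does not start with a '>' header line")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    if not records:
        raise ValueError("No sequences available")
    for name, seq in records:
        if not seq:
            raise ValueError(f"Sequence '{name}' is empty")
        bad = sorted(set(seq) - VALID_RESIDUES)
        if bad:
            raise ValueError(f"Sequence '{name}' contains invalid characters: {''.join(bad)}")
    if len(records) < 2:
        # Assumption: a multiple alignment needs at least two sequences.
        raise ValueError("At least two sequences are needed for an alignment")
    return records
```

The same function can validate pasted text, the contents of an uploaded file, and the text downloaded for each Uniprot id, so all three input sources produce the same error messages.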
Recommended procedure:
- Prepare a local installation of ClustalO (see "Clustal-O download and install")
- Test the local installation from the command line before running it through PHP
- Examine the ClustalO help to determine which options to include.
- Prepare the web application. You can use the Blast execution from the PDBBrowser example as a guideline.
- Test and complete the local application
- Copy the scripts to your space on the server. Adapt the details of the installation as needed, and test.
- If Flask is required, let me know.
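For the Python route, the step of running the external program can be sketched as below. This is an outline under stated assumptions, not the course's reference solution: the helper names are hypothetical, the executable is assumed to be on the PATH as `clustalo`, and the format codes are the short forms shown by `clustalo --help`, which should be checked against the installed version.

```python
import subprocess

# Short output-format codes as listed by `clustalo --help`
# (assumption: verify against your installed version).
OUTPUT_FORMATS = {"fa", "clu", "msf", "phy", "selex", "st", "vie"}


def build_clustalo_command(infile, outfile, outfmt="clu", executable="clustalo"):
    """Return the argument list for one Clustal-Omega run."""
    if outfmt not in OUTPUT_FORMATS:
        raise ValueError(f"Unsupported output format: {outfmt}")
    # --force lets clustalo overwrite an existing output file.
    return [executable, "-i", infile, "-o", outfile, f"--outfmt={outfmt}", "--force"]


def run_clustalo(infile, outfile, outfmt="clu", executable="clustalo"):
    """Run Clustal-Omega and raise RuntimeError with its stderr on failure."""
    cmd = build_clustalo_command(infile, outfile, outfmt, executable)
    # A list of arguments (no shell) avoids injection through user input.
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"Clustal-Omega failed: {result.stderr.strip()}")
    return outfile
```

Checking the output format against a fixed set before building the command means a value taken from the web form never reaches the command line unchecked.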
Design a Data Model
You are the manager of a bioinformatics support service and need to build a database to manage data from your users' studies. Define a data model (entities, attributes and relationships) to hold data for ONE of the following cases.
- A series of RNAseq analyses. Data should include: 1) Genes included in the study, 2) References and suppliers for the sequencing reagents and equipment used, 3) Sample and user details, 4) Results: genes, expression values, differential expression analysis, 5) References.
- A workbench to perform bioinformatics analysis. Data should include: 1) User information, 2) Tools (including required types of data and formats for input and output and parameters), 3) Data (references to actual files, that will be stored elsewhere, should include data types and formats), 4) Compatibility between tools and data types and formats, 5) Authorization rules users/data/tools.
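To show what "entities, attributes and relationships" can look like once written down, here is a small Python/SQLite sketch covering only one fragment of the workbench case: the compatibility between tools and input formats. All table names, column names and sample rows are hypothetical and chosen for this example; a complete model would need many more entities.

```python
import sqlite3

SCHEMA = """
CREATE TABLE data_format (
    format_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE tool (
    tool_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- N:M relationship: which formats each tool accepts as input
CREATE TABLE tool_input_format (
    tool_id   INTEGER NOT NULL REFERENCES tool(tool_id),
    format_id INTEGER NOT NULL REFERENCES data_format(format_id),
    PRIMARY KEY (tool_id, format_id)
);
"""


def create_demo_db():
    """Create an in-memory database with one sample tool and format."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO data_format (format_id, name) VALUES (1, 'FASTA')")
    conn.execute("INSERT INTO tool (tool_id, name) VALUES (1, 'Clustal-Omega')")
    conn.execute("INSERT INTO tool_input_format (tool_id, format_id) VALUES (1, 1)")
    return conn


def tools_accepting(conn, format_name):
    """List the names of tools that accept the given input format."""
    rows = conn.execute(
        "SELECT t.name FROM tool t "
        "JOIN tool_input_format tif ON tif.tool_id = t.tool_id "
        "JOIN data_format f ON f.format_id = tif.format_id "
        "WHERE f.name = ? ORDER BY t.name",
        (format_name,),
    )
    return [row[0] for row in rows]
```

The many-to-many relationship is stored in its own table, which is the usual way to represent "a tool accepts several formats, and a format is accepted by several tools".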