diff --git a/README.md b/README.md index 6ea92d2..d7ee296 100644 --- a/README.md +++ b/README.md @@ -3,19 +3,22 @@ custom react frontend working with consort clowder instance ## Project documentation -https://uofi.app.box.com/folder/338244434169 + +[https://uofi.app.box.com/folder/338244434169](https://uofi.app.box.com/folder/338244434169) ## Install Dependencies + - Install node version 22.0.0, npm version 10.5.1 - Run `npm install` ## Run Project - Set the environment variables in `server/.env` file. See `.template.env` file. -- Set CLOWDER_PREFIX, CILOGON_CALLBACK_URL, and CLOWDER_REMOTE_HOSTNAME values different for Consort instance (https://consort.clowderframework.org) and local development +- Set CLOWDER_PREFIX, CILOGON_CALLBACK_URL, and CLOWDER_REMOTE_HOSTNAME values different for Consort instance ([https://consort.clowderframework.org](https://consort.clowderframework.org)) and local development - run `npm start` to build react client and start the express server. ### Dev + run `npm start` ## Build and push image to NCSA hub @@ -23,28 +26,31 @@ run `npm start` - Run `docker build . -t hub.ncsa.illinois.edu/clowder/consort-frontend:` to build docker image - Image needs to be built in a linux machine. Use NCSA Radiant VMs if using a non-compatible work laptop (Eg: Mac with apple silicon) - If you ran into error `[Errno 28] No space left on device:`, try below: - - Free more spaces by running `docker system prune --all` - - Increase the Disk image size. You can find the configuration in Docker Desktop - + - Free more spaces by running `docker system prune --all` + - Increase the Disk image size. You can find the configuration in Docker Desktop - Login first: `docker login hub.ncsa.illinois.edu` - Run `docker image push hub.ncsa.illinois.edu/clowder/consort-frontend:` - ## Local testing + 1. Change package.json start to `"start": "npm-run-all --parallel open:src"`. This will not build the server side. 2. 
`utils.common.js` instead of axios request, return the clowder specific api key as in comments. 3. In `route.tsx` import `import CreateAndUpload from "./components/childComponents/CreateAndUpload";` and add `}/>` 4. Go to localhost:3000/create to test the basic functionality without login. ### Test only Preview components -1. Change `src/components/Preview.js` + +1. Change `src/components/Preview.js` + ``` import Pdf from "./previewers/Pdf"; ``` -2. Change `components/previewers/Pdf.js ` to import a static pdf and json file directly from local directory. + +1. Change `components/previewers/Pdf.js` to import a static pdf and json file directly from local directory. + ``` import pdfFile from "../../../main.pdf"; import metadataFile from "../../../main.json"; @@ -54,23 +60,27 @@ if (metadata == undefined){ } ``` -3. On terminal type "npm start". Point browser to localhost:3000/preview +1. On terminal type "npm start". Point browser to localhost:3000/preview ## Code explanation + 1. Server side + - Code `./server` - Does auth using CILogon. - RCTDB (postgresDB) schema and connection will be configured here. -2. Client side. +1. Client side. + - Code `./src` - `app.config.js` has some default configs on the extractor names - which extractor to trigger. - `routes.tsx` defines the routes. - When user is signed in, `/home` route is provided. There's separate route for Preview and FAQ page. -- `/home` route +- `/home` route 2.1. CreateAndUpload component + - `CreateAndUpload` is the main component - The dropbox for file upload accepts pdf,doc and docx formats. - The statement choice and report/preview choice should be done before dropping the file. Once an accepted file format is dropped, the extractors are triggered with the default choice of values(radio button values). @@ -79,39 +89,44 @@ if (metadata == undefined){ - The "View Results" button will either download the report or navigate to `preview` page to show the highlighted preview. 
2.2 actions/client.js + - Creates an empty dataset with the same name as the uploaded filename - Uploads the file - Calls `wordPipieline` `pdfPipeline` whether the filetype is pdf/word. 2.3 wordPipeline `utils.word_pipeline.js` + - Submits file for extraction to sOffice extractor. - Checks if a pdf file is generated. - Once pdf file is generated, triggers the `pdfPipeline` 2.4 pdfPipeline `utils.pdfPipeline` + - submits file for extraction to pdf2text extractor - Checks if a csv file is generated. - Once csv is generated, triggers the `csvPipeline` 2.5 csvPipeline `utils.csv_pipeline` + - Submits file for rct-extractor - Checks if a dataset metadata is created with the extractor name of rct-extractor for completing the process. -3. Previews +1. Previews + - FilePreview component is shown in `/preview` route. - Get file previews using method `getPreviewResources` - If the preview type is "thumbnail" or "pdf" the original pdf file with highlights and a PreviewDrawerLeft component is shown. -4. Pdf Preview +1. Pdf Preview + - The pdf file rendering with canvas highlights is done in this component. In `components/previewers/Pdf.js` - The content from the highlights json file is used to render this. - The highlights/content is rendered page by page using method `getPageHighlights` - The colors used for highlights for different statement (spirit/consort) is in `components/styledComponents/HighlightColors.js` - The labels for the sentences are placed on the sides/margin of the page next to the sentence. There's a collision avoidance mechanism given that there can be multiple labels per sentence. From testing, 3 labels per sentence can be shown without major rendering issues. -5. PreviewDrawerLeft component +1. PreviewDrawerLeft component + - This component drives the left side drawer. 
In file `childComponents/PreviewDrawerLeft` - Uses the data in highlights json file to render this - - diff --git a/server/.templateenv b/server/.templateenv index 9362827..ee6672a 100644 --- a/server/.templateenv +++ b/server/.templateenv @@ -16,9 +16,13 @@ UI_URL= CLOWDER_PREFIX='' CLOWDER_REMOTE_HOSTNAME= APIKEY= +# Clowder API version: 'v2' for Clowder2, 'v1' or unset for Clowder v1 +CLOWDER_API_VERSION=v2 +# Required for Clowder2 when creating datasets (license_id query param) +CLOWDER_DEFAULT_LICENSE_ID= # Database Configuration PGSERVER should be host.docker.internal for dockerized environment and localhost for local environment -PGSERVER=host.docker.internal +PGSERVER=host.docker.internal PGPORT=5432 PGUSER=postgres PGPASSWORD=postgres diff --git a/server/app.js b/server/app.js index 300af9e..b09319f 100644 --- a/server/app.js +++ b/server/app.js @@ -133,8 +133,9 @@ app.use(function(req, res, next) { //const baseUrl = process.env.BASE_URL; app.use('/', indexRouter); app.use('/', authRouter); -app.use('/', clowderRouter); +// Before clowderRouter's /api/* proxy (which would send /api/rctdb to Clowder). 
app.use('/api/rctdb', rctdbRouter); +app.use('/', clowderRouter); app.use('/home',express.static('../dist')); app.use('/public',express.static('../dist/public')); diff --git a/server/routes/clowder.js b/server/routes/clowder.js index 06c5935..c02c9d2 100644 --- a/server/routes/clowder.js +++ b/server/routes/clowder.js @@ -40,14 +40,34 @@ function setCorsHeaders(req, res) { res.setHeader('Access-Control-Max-Age', '86400'); // 24 hours } +/** + * Check if Clowder2 API is enabled + */ +function isClowder2() { + return process.env.CLOWDER_API_VERSION === 'v2'; +} + +/** + * Transform query params for Clowder2: superAdmin -> enable_admin + */ +function transformQueryForClowder2(queryString) { + const queryParams = new URLSearchParams(queryString); + if (queryParams.has('superAdmin')) { + queryParams.set('enable_admin', queryParams.get('superAdmin')); + queryParams.delete('superAdmin'); + } + return queryParams.toString() ? `?${queryParams.toString()}` : ''; +} + /** * Helper function to proxy requests to Clowder API * @param {Object} req - Express request object * @param {Object} res - Express response object * @param {string} apiPath - Optional API path (if not provided, extracts from req.path) * @param {string} queryString - Optional query string to append + * @param {Object} options - Optional overrides: method, body, headers */ -async function proxyToClowder(req, res, apiPath = null, queryString = null) { +async function proxyToClowder(req, res, apiPath = null, queryString = null, options = {}) { // Set CORS headers for all requests setCorsHeaders(req, res); @@ -55,6 +75,7 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { const CLOWDER_REMOTE_HOSTNAME = process.env.CLOWDER_REMOTE_HOSTNAME; const APIKEY = process.env.APIKEY; const PREFIX = process.env.CLOWDER_PREFIX || ''; + const useV2 = isClowder2(); if (!CLOWDER_REMOTE_HOSTNAME || !APIKEY) { return res.status(500).json({ error: 'Server configuration error' }); @@ -62,31 +83,22 @@ 
async function proxyToClowder(req, res, apiPath = null, queryString = null) { // Extract the path after /api/ or use provided apiPath const targetApiPath = apiPath || req.path.replace(/^\/api/, ''); - // Reconstruct the full Clowder API path: CLOWDER_REMOTE_HOSTNAME + PREFIX + /api + remaining path - const targetUrl = `${CLOWDER_REMOTE_HOSTNAME}${PREFIX}/api${targetApiPath}`; + // Reconstruct the full Clowder API path: CLOWDER_REMOTE_HOSTNAME + PREFIX + /api[/v2] + remaining path + const apiPrefix = useV2 ? '/api/v2' : '/api'; + const targetUrl = `${CLOWDER_REMOTE_HOSTNAME}${PREFIX}${apiPrefix}${targetApiPath}`; // Use provided query string or preserve from request let finalQueryString = ''; if (queryString !== null) { - // If queryString is provided, parse and remove superAdmin if present if (queryString) { - const queryParams = new URLSearchParams(queryString); - if (queryParams.has('superAdmin')) { - queryParams.delete('superAdmin'); - } - finalQueryString = queryParams.toString() ? `?${queryParams.toString()}` : ''; + finalQueryString = useV2 ? transformQueryForClowder2(queryString) : (queryString.startsWith('?') ? queryString : `?${queryString}`); } else { finalQueryString = ''; } } else { - // Otherwise, preserve query string from request and remove superAdmin if present if (req.url.includes('?')) { const existingQuery = req.url.substring(req.url.indexOf('?') + 1); - const queryParams = new URLSearchParams(existingQuery); - if (queryParams.has('superAdmin')) { - queryParams.delete('superAdmin'); - } - finalQueryString = queryParams.toString() ? `?${queryParams.toString()}` : ''; + finalQueryString = useV2 ? 
transformQueryForClowder2(existingQuery) : `?${existingQuery}`; } else { finalQueryString = ''; } @@ -99,7 +111,6 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { }; // Copy relevant headers from the client request, but exclude host, content-length, and connection - // For multipart/form-data, we'll set the content-type header later with the boundary const contentType = req.headers['content-type'] || ''; const isMultipart = contentType.includes('multipart/form-data'); @@ -119,20 +130,27 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { } // Prepare request options - const options = { - method: req.method, + const requestOptions = { + method: options.method || req.method, headers: headers, }; // Handle request body for POST/PUT/PATCH - if (['POST', 'PUT', 'PATCH'].includes(req.method)) { - if (contentType.includes('application/json')) { - options.body = JSON.stringify(req.body); + const bodySource = options.body !== undefined ? options.body : req.body; + const method = requestOptions.method || req.method; + + if (['POST', 'PUT', 'PATCH'].includes(method)) { + if (options.body !== undefined) { + requestOptions.body = typeof options.body === 'string' ? 
options.body : JSON.stringify(options.body); + if (typeof options.body !== 'string') { + headers['Content-Type'] = 'application/json'; + } + } else if (contentType.includes('application/json')) { + requestOptions.body = JSON.stringify(bodySource); } else if (isMultipart) { // For multipart/form-data, reconstruct FormData const formData = new FormData(); - // Add fields from req.body if (req.body) { Object.keys(req.body).forEach(key => { if (req.body[key] !== undefined && req.body[key] !== null) { @@ -141,11 +159,12 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { }); } - // Add files from req.files if (req.files) { if (Array.isArray(req.files)) { req.files.forEach(file => { - formData.append(file.fieldname, file.buffer, { + // Clowder2 expects "file" field; Clowder v1 uses "File" + const fieldName = useV2 ? 'file' : (file.fieldname || 'File'); + formData.append(fieldName, file.buffer, { filename: file.originalname, contentType: file.mimetype }); @@ -154,7 +173,8 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { Object.keys(req.files).forEach(key => { const files = Array.isArray(req.files[key]) ? req.files[key] : [req.files[key]]; files.forEach(file => { - formData.append(key, file.buffer, { + const fieldName = useV2 && key === 'File' ? 'file' : key; + formData.append(fieldName, file.buffer, { filename: file.originalname, contentType: file.mimetype }); @@ -163,33 +183,31 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { } } - // Add single file from req.file if (req.file) { - formData.append(req.file.fieldname || 'file', req.file.buffer, { + const fieldName = useV2 ? 
'file' : (req.file.fieldname || 'file'); + formData.append(fieldName, req.file.buffer, { filename: req.file.originalname, contentType: req.file.mimetype }); } - options.body = formData; - // Set content-type header with boundary from form-data + requestOptions.body = formData; const formDataHeaders = formData.getHeaders(); headers['Content-Type'] = formDataHeaders['content-type']; - options.headers = headers; - } else if (req.body) { - options.body = req.body; + requestOptions.headers = headers; + } else if (bodySource) { + requestOptions.body = bodySource; } } // Make the request to Clowder - const response = await fetch(fullUrl, options); + const response = await fetch(fullUrl, requestOptions); - // Set CORS headers again before sending response (in case they were overwritten) + // Set CORS headers again before sending response setCorsHeaders(req, res); // Copy response headers response.headers.forEach((value, key) => { - // Skip headers that shouldn't be forwarded if (key.toLowerCase() !== 'content-encoding' && key.toLowerCase() !== 'transfer-encoding' && key.toLowerCase() !== 'connection' && @@ -211,17 +229,14 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { responseContentType.includes('application/zip') || responseContentType.includes('application/pdf') )) { - // For binary data, send as buffer const buffer = await response.buffer(); return res.status(response.status).send(buffer); } else { - // For text responses const text = await response.text(); return res.status(response.status).send(text); } } catch (error) { console.error('Proxy error:', error); - // Ensure CORS headers are set even on errors setCorsHeaders(req, res); res.status(500).json({ error: 'Proxy request failed', message: error.message }); } @@ -229,27 +244,11 @@ async function proxyToClowder(req, res, apiPath = null, queryString = null) { /** * Middleware to ALWAYS set CORS headers for all requests - * This must run before ensureLoggedIn so CORS headers are 
present even if auth fails */ function handleCorsHeaders(req, res, next) { - // Always set CORS headers for all requests setCorsHeaders(req, res); - // Handle OPTIONS preflight requests - if (req.method === 'OPTIONS') { - return res.status(200).end(); - } - next(); -} - -/** - * Middleware to handle CORS preflight requests BEFORE authentication - * This must run before ensureLoggedIn to prevent redirects on OPTIONS requests - * @deprecated Use handleCorsHeaders instead - */ -function handleCorsPreflight(req, res, next) { if (req.method === 'OPTIONS') { - setCorsHeaders(req, res); return res.status(200).end(); } next(); @@ -259,35 +258,49 @@ function handleCorsPreflight(req, res, next) { * Middleware to handle CORS preflight and file uploads */ function handleCorsAndUpload(req, res, next) { - // Handle CORS preflight requests - skip multer for OPTIONS if (req.method === 'OPTIONS') { setCorsHeaders(req, res); return res.status(200).end(); } - // Apply multer only for non-OPTIONS requests upload.any()(req, res, next); } /** * Global OPTIONS handler for all /api/* routes - * This must be registered before any other routes to catch preflight requests */ router.options('/api/*', handleCorsHeaders); /** - * Specific route: POST /api/datasets/createempty + * POST /api/datasets/createempty + * Clowder v1: POST /api/datasets/createempty with {name, description, space} + * Clowder2: POST /api/v2/datasets with {name, description, ...} - different body schema */ router.post('/api/datasets/createempty', handleCorsHeaders, handleCorsAndUpload, async function (req, res) { - await proxyToClowder(req, res, '/datasets/createempty'); + if (isClowder2()) { + const body = req.body || {}; + const licenseId = process.env.CLOWDER_DEFAULT_LICENSE_ID || ''; + const datasetBody = { + name: body.name || 'Untitled Dataset', + description: body.description || '', + status: body.status || 'PRIVATE' + }; + const queryString = licenseId ? 
`?license_id=${encodeURIComponent(licenseId)}` : ''; + await proxyToClowder(req, res, '/datasets', queryString, { + method: 'POST', + body: datasetBody + }); + } else { + await proxyToClowder(req, res, '/datasets/createempty'); + } }); /** - * Specific route: POST /api/uploadToDataset/:datasetId - * Adds extract=false query parameter by default + * POST /api/uploadToDataset/:datasetId + * Clowder v1: POST /api/uploadToDataset/:id + * Clowder2: POST /api/v2/datasets/:id/files (form field "file" instead of "File") */ router.post('/api/uploadToDataset/:datasetId', handleCorsHeaders, handleCorsAndUpload, async function (req, res) { - // Build query string with extract=false, but allow client to override const existingQuery = req.url.includes('?') ? req.url.substring(req.url.indexOf('?') + 1) : ''; const queryParams = new URLSearchParams(existingQuery); if (!queryParams.has('extract')) { @@ -295,45 +308,203 @@ router.post('/api/uploadToDataset/:datasetId', handleCorsHeaders, handleCorsAndU } const queryString = queryParams.toString(); - await proxyToClowder(req, res, `/uploadToDataset/${req.params.datasetId}`, queryString); + if (isClowder2()) { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/files`, queryString ? 
`?${queryString}` : null); + } else { + await proxyToClowder(req, res, `/uploadToDataset/${req.params.datasetId}`, queryString); + } }); /** - * Specific route: POST /api/files/:fileId/extractions - * This route handles JSON requests for triggering extractions + * POST /api/files/:fileId/extractions + * Clowder v1: POST /api/files/:id/extractions with body {extractor, parameters} + * Clowder2: POST /api/v2/files/:id/extract?extractorName=X with body {parameters} */ router.post('/api/files/:fileId/extractions', handleCorsHeaders, async function (req, res) { - console.log('POST /api/files/:fileId/extractions route hit', req.params.fileId); - await proxyToClowder(req, res, `/files/${req.params.fileId}/extractions`); + if (isClowder2()) { + const body = req.body || {}; + const extractorName = body.extractor || ''; + const parameters = body.parameters || {}; + const queryString = extractorName ? `?extractorName=${encodeURIComponent(extractorName)}` : ''; + await proxyToClowder(req, res, `/files/${req.params.fileId}/extract`, queryString, { + method: 'POST', + body: parameters + }); + } else { + await proxyToClowder(req, res, `/files/${req.params.fileId}/extractions`); + } }); /** - * Specific route: GET /api/files/:fileId/blob + * GET /api/files/:fileId/blob + * Clowder v1: GET /api/files/:id/blob + * Clowder2: GET /api/v2/files/:id (no /blob) */ router.get('/api/files/:fileId/blob', handleCorsHeaders, async function (req, res) { - await proxyToClowder(req, res, `/files/${req.params.fileId}/blob`); + if (isClowder2()) { + await proxyToClowder(req, res, `/files/${req.params.fileId}`); + } else { + await proxyToClowder(req, res, `/files/${req.params.fileId}/blob`); + } }); /** - * Specific route: GET /api/datasets/:datasetId/listFiles + * GET /api/datasets/:datasetId/listFiles + * Clowder v1: GET /api/datasets/:id/listFiles -> object keyed by file id + * Clowder2: GET /api/v2/datasets/:id/files -> Paged {items, total}; transform to v1-compatible format */ 
router.get('/api/datasets/:datasetId/listFiles', handleCorsHeaders, async function (req, res) { - await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/listFiles`); + if (isClowder2()) { + setCorsHeaders(req, res); + try { + const CLOWDER_REMOTE_HOSTNAME = process.env.CLOWDER_REMOTE_HOSTNAME; + const APIKEY = process.env.APIKEY; + const PREFIX = process.env.CLOWDER_PREFIX || ''; + if (!CLOWDER_REMOTE_HOSTNAME || !APIKEY) { + return res.status(500).json({ error: 'Server configuration error' }); + } + const url = `${CLOWDER_REMOTE_HOSTNAME}${PREFIX}/api/v2/datasets/${req.params.datasetId}/files`; + const response = await fetch(url, { + headers: { 'X-API-Key': APIKEY, 'Accept': 'application/json' } + }); + const data = await response.json(); + // Transform Paged {items} to v1 format: object keyed by file id + console.log('Clowder2 files response:', JSON.stringify(data, null, 2)); + if (data.data && Array.isArray(data.data)) { + const fileMap = {}; + for (const file of data.data) { + const id = file.id || file._id; + if (id) { + const ct = file.content_type; + const contentType = typeof ct === 'string' ? 
ct : (ct && ct.content_type) || 'application/octet-stream'; + fileMap[id] = { + id, + filename: file.name || file.filename, + 'contentType': contentType, + 'date-created': file.created, + size: String(file.bytes || 0) + }; + } + } + return res.status(response.status).json(fileMap); + } + return res.status(response.status).json(data); + } catch (error) { + console.error('ListFiles proxy error:', error); + setCorsHeaders(req, res); + return res.status(500).json({ error: 'Proxy request failed', message: error.message }); + } + } else { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/listFiles`); + } }); /** - * Specific route: GET /api/datasets/:datasetId/metadata.jsonld + * GET /api/datasets/:datasetId/metadata.jsonld + * Clowder v1: GET /api/datasets/:id/metadata.jsonld + * Clowder2: GET /api/v2/datasets/:id/metadata */ router.get('/api/datasets/:datasetId/metadata.jsonld', handleCorsHeaders, async function (req, res) { - await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/metadata.jsonld`); + if (isClowder2()) { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/metadata`); + } else { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/metadata.jsonld`); + } }); /** - * Specific route: GET /api/files/:fileId/getPreviews + * POST /api/datasets/:datasetId/usermetadatajson + * Clowder v1: POST /api/datasets/:id/usermetadatajson + * Clowder2: POST /api/v2/datasets/:id/metadata + */ +router.post('/api/datasets/:datasetId/usermetadatajson', handleCorsHeaders, async function (req, res) { + if (isClowder2()) { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/metadata`); + } else { + await proxyToClowder(req, res, `/datasets/${req.params.datasetId}/usermetadatajson`); + } +}); + +/** + * GET /api/files/:fileId/getPreviews + * Clowder v1: GET /api/files/:id/getPreviews + * Clowder2: GET /api/v2/visualizations/{resource_id}/config - different API + * For now proxy to same path; Clowder2 may 
return 404 if previews not available */ router.get('/api/files/:fileId/getPreviews', handleCorsHeaders, async function (req, res) { - await proxyToClowder(req, res, `/files/${req.params.fileId}/getPreviews`); + if (isClowder2()) { + // Clowder2 uses visualizations; try /api/v2/visualizations/{file_id}/config + // If resource_id is file_id, this may work + await proxyToClowder(req, res, `/visualizations/${req.params.fileId}/config`); + } else { + await proxyToClowder(req, res, `/files/${req.params.fileId}/getPreviews`); + } }); -module.exports = router; +/** + * GET /api/extractions/:fileId/statuses + * Clowder v1: GET /api/extractions/:id/statuses -> {Status, extractorName: "DONE"} + * Clowder2: GET /api/v2/jobs?file_id=:id -> Paged with items; transform to v1 format + */ +router.get('/api/extractions/:fileId/statuses', handleCorsHeaders, async function (req, res) { + if (isClowder2()) { + setCorsHeaders(req, res); + try { + const CLOWDER_REMOTE_HOSTNAME = process.env.CLOWDER_REMOTE_HOSTNAME; + const APIKEY = process.env.APIKEY; + const PREFIX = process.env.CLOWDER_PREFIX || ''; + if (!CLOWDER_REMOTE_HOSTNAME || !APIKEY) { + return res.status(500).json({ error: 'Server configuration error' }); + } + const url = `${CLOWDER_REMOTE_HOSTNAME}${PREFIX}/api/v2/jobs?file_id=${req.params.fileId}`; + const response = await fetch(url, { + headers: { 'X-API-Key': APIKEY, 'Accept': 'application/json' } + }); + const data = await response.json(); + // Transform Clowder2 Paged jobs to v1 format: {Status, listener_id: "DONE"|"Processing"} + const result = { Status: 'Done' }; + if (data.items && Array.isArray(data.items)) { + let anyProcessing = false; + for (const job of data.items) { + const listenerId = job.listener_id || job.id; + const status = (job.status || 'CREATED').toUpperCase(); + const done = status === 'DONE' || status === 'COMPLETED'; + result[listenerId] = done ? 
'DONE' : 'Processing'; + if (!done) anyProcessing = true; + } + if (anyProcessing) result.Status = 'Processing'; + } + return res.status(response.status).json(result); + } catch (error) { + console.error('Extraction status proxy error:', error); + setCorsHeaders(req, res); + return res.status(500).json({ error: 'Proxy request failed', message: error.message }); + } + } else { + await proxyToClowder(req, res, `/extractions/${req.params.fileId}/statuses`); + } +}); + +/** + * GET /api/thumbnails/:thumbnailId/blob + * Clowder v1: GET /api/thumbnails/:id/blob + * Clowder2: GET /api/v2/thumbnails/:id (no /blob) + */ +router.get('/api/thumbnails/:thumbnailId/blob', handleCorsHeaders, async function (req, res) { + if (isClowder2()) { + await proxyToClowder(req, res, `/thumbnails/${req.params.thumbnailId}`); + } else { + await proxyToClowder(req, res, `/thumbnails/${req.params.thumbnailId}/blob`); + } +}); +/** + * Catch-all for all other /api/* routes - proxy with path prefix + * Handles routes like /api/datasets, /api/datasets/:id, /api/files/:id/metadata, etc. 
+ */ +router.all('/api/*', handleCorsHeaders, handleCorsAndUpload, async function (req, res) { + const targetPath = req.path.replace(/^\/api/, '') || '/'; + await proxyToClowder(req, res, targetPath); +}); + +module.exports = router; diff --git a/src/utils/dataset.js b/src/utils/dataset.js index 80bb307..97b461c 100644 --- a/src/utils/dataset.js +++ b/src/utils/dataset.js @@ -11,31 +11,31 @@ const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)); export function getServerUrl(relativePath) { const serverUrl = process.env.SERVER_URL || ""; const serverPort = process.env.SERVER_PORT || ""; - + // If no server URL is configured, return relative path (works when frontend and backend are on same origin) // This is the preferred approach for development when both run on the same port or when using a proxy if (!serverUrl || serverUrl.trim() === "") { return relativePath; } - + // Construct base URL let baseUrl = serverUrl.trim(); - + // Add protocol if missing if (!baseUrl.startsWith("http://") && !baseUrl.startsWith("https://")) { baseUrl = `http://${baseUrl}`; } - + // Add port if specified and not already in URL if (serverPort && serverPort.trim() !== "" && !baseUrl.includes(`:${serverPort.trim()}`) && !baseUrl.match(/:\d+$/)) { // Remove trailing slash from baseUrl before appending port baseUrl = baseUrl.replace(/\/$/, ""); baseUrl = `${baseUrl}:${serverPort.trim()}`; } - + // Ensure relativePath starts with / const path = relativePath.startsWith("/") ? 
relativePath : `/${relativePath}`; - + return `${baseUrl}${path}`; } diff --git a/src/utils/file.js b/src/utils/file.js index e4bd0ad..1e39734 100644 --- a/src/utils/file.js +++ b/src/utils/file.js @@ -16,10 +16,10 @@ export async function submitForExtraction(file_id, extractor_name, statementType else{ body = {"extractor": extractor_name}; } - + const extraction_response = await extractionRequest(file_id, body); console.log("Extraction response for extractor ", extractor_name, extraction_response); - if (extraction_response !== null && extraction_response.status === "OK") { + if (extraction_response != null && extraction_response.status === "OK") { return true; } else { @@ -50,18 +50,15 @@ async function extractionRequest(file_id, body_data) { //const extraction_response_text = await response.text(); //console.log(extraction_response_text); if (response.status === 200) { - // return {"status":"OK","job_id":"string"} + return { status: "OK", job_id: extraction_response }; } else if (response.status === 409){ // TODO handle error await sleep(30000); - await extractionRequest_loop(); - } - else { + return await extractionRequest_loop(); + } else { // TODO handle error - extraction_response.status = "FAIL"; + return { status: "FAIL", job_id: extraction_response }; } - return extraction_response; - }; extraction_response = await extractionRequest_loop(); @@ -243,7 +240,7 @@ export async function getPreviewResources(fileId, preview) { const preview_config = {}; //console.log(preview); {p_id:"HTML", p_main:"html-iframe.js", p_path:"/assets/javascripts/previewers/html", pv_contenttype:"text/html", pv_id:"64ac2c9ae4b024bdd77bbfb1",pv_length:"52434",pv_route:"/files/64ac2c9ae4b024bdd77bbfb1/blob"} //{"pv_route": "/clowder/api/previews/67224c2ae4b095dc59cb5fde","p_main": "thumbnail-previewer.js","pv_id": "67224c2ae4b095dc59cb5fde","p_path": "/clowder/assets/javascripts/previewers/thumbnail","p_id": "Thumbnail","pv_length": "157049","pv_contenttype": "image/png"} - + 
preview_config.previewType = preview["p_id"].replace(" ", "-").toLowerCase(); // html if (preview_config.previewType === "thumbnail") { @@ -276,7 +273,7 @@ export async function getPreviewResources(fileId, preview) { preview_config.fileid = preview["pv_id"]; preview_config.previewer = `/public${preview["p_path"]}/`; preview_config.fileType = preview["pv_contenttype"]; - + // Handle preview resource URL let pv_routes = preview["pv_route"]; // pv_route:"/files/64ac2c9ae4b024bdd77bbfb1/blob" if (!pv_routes.includes("/api/")) {