I'm a co-founder at Scite. We've built an MCP server that lets AI assistants search and read full-text scientific articles.
We have 50M+ full-text articles from licensed publisher agreements. If your institution subscribes to a journal, you can authenticate with your institutional credentials and your AI assistant gets the same access you'd have through your library. No scraping, no workarounds, fully licensed.
For anything outside your subscription, you get metadata, abstracts, and direct linkouts to the publisher page.
It works with Claude, Cursor, and anything that supports MCP.
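As a sketch of what "anything that supports MCP" means in practice, a remote MCP server is typically wired into a client with a short config entry. The server name and URL below are placeholders for illustration, not Scite's published values; `mcp-remote` is a common bridge for clients that only launch local commands:

```json
{
  "mcpServers": {
    "scite": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://example.com/mcp"]
    }
  }
}
```

Once the client restarts, the assistant can call the server's search and full-text tools like any other MCP tool.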
Scite.ai | Full-Stack Software Engineers | Remote (US/Canada) | Full-time | https://scite.ai/
We are hiring for multiple engineering roles at scite.ai to build the world's best place to search, discover, understand, and access research.
We have unique access to the world's literature through publisher partnerships and have developed a market-leading generative AI assistant.
We are mostly looking for senior full-stack software engineers with experience and interest in scaling up systems and building large-scale ingestion- and search-based applications. The role requires no previous AI experience or knowledge (though it's a plus!), so it's perfect for folks who want to break into the generative AI space from an engineering background.
The results in the link are literally how Hacker News has been mentioned in scientific papers. I'm not sure how you consider that clickbait or misleading, or why you consider those results meaningless.
Meta (a great name) was a very promising startup a few years back in scholarly publishing. It had successfully analyzed millions of articles to help with the discovery of research. It's a shame it is being shuttered, but there are a lot of tools out there now doing similar things, and in many cases much, much more:
So far as I understand, it's a lot more than that. Their model has not only indexed but has also individually classified 900M+ citation statements by sentiment, and the outputs produce a score representing each paper's relative trustworthiness.
To your point though it's fun to think about generative applications too. I for one would appreciate a writing assistant trained on millions of scientific papers -- like OK I'll write that last-minute proposal for you but you better believe it's going to be chock-full of dispensable lexical arcana.
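To make the "sentiment classifications roll up into a score" idea concrete, here is a minimal sketch. Scite's actual scoring model is not public, so the labels, weighting, and function below are illustrative assumptions only:

```python
from collections import Counter

def naive_trust_signal(classifications):
    """Summarize classified citation statements for one paper.

    Illustrative only: assumes each statement was labeled
    'supporting', 'contrasting', or 'mentioning'. Scite's real
    scoring formula is not published.
    """
    counts = Counter(classifications)
    supporting = counts["supporting"]
    contrasting = counts["contrasting"]
    stance_taking = supporting + contrasting  # statements that take a stance
    if stance_taking == 0:
        return None  # only neutral mentions; no signal either way
    return supporting / stance_taking

ratio = naive_trust_signal(
    ["supporting", "mentioning", "supporting", "contrasting"]
)
print(ratio)  # fraction of stance-taking statements that are supporting
```

The useful property of a ratio like this is that a paper with many neutral mentions isn't penalized; only statements that actually agree or disagree move the signal.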
We (scite.ai) have extracted 918M citation statements (three full sentences each) from 27M full-text articles (more than half from paywalled articles obtained through indexing agreements).
And, recently, we have made these citation statements easily searchable, so you can find expert analyses and opinions extracted from the literature on nearly any topic: https://scite.ai/search/citations
We can't release the full dataset as our licensing agreements with publishers restrict it. We do have an API though that can be used: https://api.scite.ai/docs
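As a sketch of calling such an API from Python, the helper below builds a search request. The endpoint path, query parameter name, and bearer-token auth scheme are assumptions for illustration; check https://api.scite.ai/docs for the real ones:

```python
import urllib.parse

API_ROOT = "https://api.scite.ai"  # real docs: https://api.scite.ai/docs

def build_citation_search(query, token, endpoint="/search/citations"):
    """Build the URL and headers for a citation-statement search.

    The endpoint path and auth header here are hypothetical,
    shown only to illustrate the shape of an API call.
    """
    url = API_ROOT + endpoint + "?" + urllib.parse.urlencode({"q": query})
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers

url, headers = build_citation_search("gut microbiome", "YOUR_TOKEN")
print(url)
# The request itself could then be sent with urllib.request or httpx.
```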
Nice work on Scite.
I'm not sure if this is a different use case than I have, but my searches are listing duplicates of the same papers many times. Is there a reason to not collapse duplicates into a single result?
Is this on Citation Statement search? I think it is probably because the citation context contains two or more citations in it. We look at citations per sentence so it is duplicated there. I can see how that is confusing though.
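The per-sentence extraction described above can be sketched as follows: a sentence citing two papers yields one record per cited paper, and collapsing on the sentence text groups those records back together. The field names here are illustrative, not Scite's actual schema:

```python
def collapse_duplicates(statements):
    """Group citation-statement records that share the same sentence.

    Each input record pairs one sentence of citation context with one
    cited paper; a sentence citing N papers appears N times.
    """
    grouped = {}
    for s in statements:
        entry = grouped.setdefault(s["text"], {"text": s["text"], "cites": []})
        entry["cites"].append(s["cited_doi"])
    return list(grouped.values())

records = [
    {"text": "Both studies report the effect [1, 2].", "cited_doi": "10.1/a"},
    {"text": "Both studies report the effect [1, 2].", "cited_doi": "10.1/b"},
]
print(collapse_duplicates(records))  # one result, citing both papers
```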
Hi! I checked out Scite and really like the product. However, I can't seem to find relevant papers in my field (ML & AI); e.g., I searched for the batch norm paper, which there has been a lot of 'reevaluation' of, but it's not available.
I suppose at this point you don't have access to (some of) the big AI journals/conferences. Is this something on the horizon?
We build AI-powered tools that help researchers and organizations make sense of the scientific literature. Small team, hard problems, real momentum. Our MCP server just launched, enterprise adoption is growing, and the product roadmap is full of ambitious bets.
Stack:
Frontend: React 18, Redux/Redux Saga, Webpack, SWC, Express (SSR), CSS Modules
Backend: FastAPI (Python 3.11), Flask, PostgreSQL, Elasticsearch, Redis, Celery
AI/ML: Claude (Anthropic), OpenAI, PyTorch, Hugging Face Transformers, sentence-transformers, RAG pipelines
Infrastructure: AWS (ECS Fargate, RDS, S3, Lambda), Docker, Kong API Gateway, GitHub Actions, CircleCI
Observability: OpenTelemetry, New Relic, Sentry, PostHog
Other: Stripe, SendGrid, Playwright, TypeScript
What we're looking for:
Strong engineering fundamentals
Comfort working across the stack, Python and JavaScript/React in particular
Experience with PostgreSQL, Elasticsearch, or similar data-intensive systems is a plus
Familiarity with LLM APIs and retrieval-augmented generation (RAG) is a bonus
Genuine curiosity about how AI can improve how people interact with research
If you're interested or know someone who might be a great fit, email josh@scite.ai