Hacker Newsnew | past | comments | ask | show | jobs | submit | daniel31x13's commentslogin

I maintain an open-source project called Linkwarden and this exact discussion is one of the reasons why it exists, teams needed a way to preserve referenced URLs reliably without having to depend on external services.

It stores webpages in multiple formats (HTML snapshot, screenshot, PDF snapshot, and a fully dedicated reader view) so you’re not relying on a single fragile archive method.

There’s both a hosted cloud plan [1] which directly supports the project, and a fully self-hosted option [2], depending on how much control you need over storage and retention.

[1]: https://linkwarden.app

[2]: https://github.com/linkwarden/linkwarden


Linkwarden is awesome and with the singlefile extension it's pretty easy to store things you can see but the scraper gets blocked on.

One question, what's your stance on adding a way to mark articles as read or "archive" them like other apps that are branded a bit more as storing things to read later. You can technically do something similar with tags but it's a bit clunky of a UX.


Thanks! At the moment we’re focused on archiving rather than read-later workflows, but this is great feedback. I’ve already added it to the feature requests list.


Archival is one side of the coin, but consumption as-in read-later is very important as well.

I am currently evaluating Linkwarden, Wallabag, Hoarder, Linkding and each of the services has pro and cons making it hard for me to choose one. Linkwarden is AWESOME in its way to store content in multiple formats, but the read-later wfs could be improved.

Without checking again: does Linkwarden sync reading location across devices and automatically scrolls to that location on the next device? Does it tell me how „long“ an article takes to read (solely based on the length of it)? Does Linkding support marking up text and persist (mark some text yellow and see those marks somewhere or even add comments or favorite specific parts of texts).

No need to answer any of the questions, I can research myself, just putting these out there for a read-later solution I would like. Add a link on my mobile device, Linkwarden could do its magic in the backend, and I check out the content later on desktop or even on my mobile device.


> with the singlefile extension it's pretty easy to store things you can see but the scraper gets blocked on

FWIW, at least on iOS, it's possible to inject Javascript into the web site being currently displayed by Safari as a side effect of sharing a web link to an app via the share sheet.

Several "read it later" style apps use this successfully to get around paywalls (assuming you've paid yourself) and other robot blockers. Any plans for Linkwarden to do this (or does it already)?


Any docs on this? I didn't know this was a thing.


I believe the key search term was NSExtensionJavaScriptPreprocessingFile, e.g. documented here: https://developer.apple.com/library/archive/documentation/Ge...



That's cool, but also requires using the Singlefile extension (and granting it access), right?

What I like about the share sheet JS method is that it doesn't get access to most of my browsing sessions. (The shared-to app getting access to my browser session is somewhat unexpected, though.)


Neat. How does the archive.org integration works?

Does it just POST the url to them for them to fetch? Or is there any integration/trust to store what you already fetched on the client directly on their archives?


> Does it just POST the url to them for them to fetch?

Correct.


I literally just came across and installed your project on my server today. It's fantastic and with it I was able to cancel my readwise subscription. Great work!




If you review the project’s history, you’ll notice that Linkwarden was started well before the vibe-coding hype, and even before the existence of ChatGPT.


Sad to see Pocket shutting down. For anyone seeking alternatives, I've been working on Linkwarden[1], an open-source bookmark manager that hit HN's front page twice[2][3]. Plenty of other great alternatives out there too, such as Raindrop (not fully open-source), Karakeep, and Wallabag.

Also, there's an official Linkwarden mobile app in development, aiming to support most (if not all) of Pocket's key features :)

[1]: https://linkwarden.app

[2]: https://news.ycombinator.com/item?id=36942308

[3]: https://news.ycombinator.com/item?id=43856801


Yes, you can import any kind of bookmarks html files.

There are also other importing formats we do support as well like Wallabag, Omnivore, etc…


We’re working on an official mobile app[1], which will most likely include this feature sometime after its launch :)

[1]: https://github.com/linkwarden/linkwarden/issues/246#issuecom...


An official app with that sounds great! From what you know, would it be possible to also have offline support with the PWA?


Will the offline mode work on laptops?


Hello everyone, I’m the main developer behind Linkwarden. Glad to see it getting some attention here!

Some key features of the app (at the moment):

- Text highlighting

- Full page archival

- Full content search

- Optional local AI tagging

- Sync with browser (using Floccus)

- Collaborative

Also, for anyone wondering, all features from the cloud plan are available to self-hosted users :)


Cool, looks like text highlighting is a new addition in 2.10. There aren't any examples in the demo site of this, but can it capture the highlighted text snippets and show them in the link details page? That would help me recall quickly why I saved the link, without opening the original link and re-reading the page. I haven't really seen this in other tools (or maybe I just haven't looked hard enough), except Memex.


> There aren't any examples in the demo site of this

This is because we haven't updated the demo to the latest version.

> but can it capture the highlighted text snippets and show them in the link details page?

That's a good idea that we might implement later, but at the moment you can only highlight the links[1].

[1]: https://blog.linkwarden.app/releases/2.10#%EF%B8%8F-text-hig...


> “…can it capture the highlighted text snippets and show them in the link details page.”

Essentially a quote with attribution.


Great product! Does it handle special metadata like https://mymind.com/ does, eg. showing prices directly in the UI if the saved link is a product in a shop? If not, things like that would be a great addition!


Site note: When a website advertising a product does a bad job at optimising the loading of the page, that's usually a red flag for me; yes that website has noticeable jitter when scrolling up and down even though it _only_ load around ~70Mb worth of assets initially.


(The historical price on the day the link was published, or the current price, or over a date range, or configurable? I see different use-cases)


I'd be interested to hear your thoughts on having a PWA vs regular mobile apps since it looks like you started with a PWA, but are moving to regular apps. Is that just a demand / eyeballs thing or were there technical reasons?


Mostly the UX it provides. PWAs are a quick and easy way to support mobile but the UX is nowhere near as good a traditional mobile app…


I have about ~30k .webarchive files — is there a chance to import them?


Even if importing them they might remain stuck in some import queue and you might not be able to search them. That was a blocker for me https://github.com/linkwarden/linkwarden/issues/586


Suggestion/request:

What I'd really love is a super compact "short-name only" view of links. Just words, not lines or galleries. For super-high content views.



Ahh, yes, you can reduce it to names with a lot of columns. In my personal ideal, I've love to store a short-name for a link and have no boxes. Personally, I've always wanted links to be like the tag cloud in pinboard and to have a page with multiple tags/categories.

I'd also love a separation of human tags and AI tags (even by base or stem), just in case they provided radically different views, but both were useful.

EDIT: Just did a quick look in the documentation, is there a native or supported distinction between links that are like bookmarks and links that are more content/articles/resources?


Could still be a lot more compact. Would also like the hierarchical view in the main pane.

In any case, nice project, thank you.


Came here to ask for exactly this.


> Full page archival

Does it grab the DOM from my browser as it sees it? Or is it a separate request? If so, how does it deal with authentication?


So there are different ways it archives a webpage.

It currently stores the full webpages as a single html file, a screenshot, a pdf, a read-it-later view.

Aside from that, you can also send the webpages to the Wayback Machine to take a snapshot.

To archive pages behind a login or paywall, you can use the browser extension, which captures an image of the webpage in the browser and sends it to the server.


> To archive pages behind a login or paywall, you can use the browser extension, which captures an image of the webpage in the browser and sends it to the server.

Just an image? So no full text search?


> To archive pages behind a login or paywall, you can use the browser extension, which captures an image of the webpage in the browser and sends it to the server.

It'd be awesome to integrate this with the SingleFile extension, which captures any webpage into a self-contained HTML file (with JS, CSS, etc, inlined).


We might add this, it's actually highly suggested by the users :)


How difficult would it be to import an existing list of links/tags? Also, if I were using a hosted version, would I be able to eg insert/retrieve files via an API call?

I ask because currently I use Readwise but have a local script that syncs the reader files to a local DB, which then feeds into some custom agent flows I have going on on the side.


> How difficult would it be to import an existing list of links/tags?

Pretty easy if you have it in a bookmark html file format.

> Also, if I were using a hosted version, would I be able to eg insert/retrieve files via an API call?

Yup, check out the api documentation:

https://docs.linkwarden.app/api/api-introduction


Interesting project! A couple of questions:

- Does the web front end support themes? It’s a trivial thing but based on the screenshots, various things about the default theme bug me and it would be nice to be able to change those without a user style extension.

- Does it have an API that would allow development of a native desktop front end?


> Does the web front end support themes?

Yes[1].

> Does it have an API that would allow development of a native desktop front end?

Also yes[2].

[1]: https://blog.linkwarden.app/releases/2.9#-customizable-theme

[2]: https://docs.linkwarden.app/api/api-introduction


Very very neat!

a question arose for me though: if the AI tagging is self hostable as well, how taxing is it for the hardware, what would the minimum viable hardware be?


Thanks! A lightweight model like the phi3:mini-4k is enough for this feature.[1]

It’s worth mentioning that you can also use external providers like OpenAI and Anthropic to tag the links for you.

[1]: https://docs.linkwarden.app/self-hosting/ai-worker


Curious if the the paid tier helps support development of the project


Definitely! :)


> Optional local AI tagging

https://docs.linkwarden.app/self-hosting/ai-worker

I took a look at this... and you use the Ollama API behind the scenes?? Why not use an OpenAI compatible endpoint like the rest of the industry?

Locking it to Ollama is stupid. Ollama is just a wrapper for llama.cpp anyways. Literally everyone else running LLMs locally- llama.cpp, vllm (which is what the inference providers use, also I know Deepseek API servers use this behind the scenes), LM Studio (for the causal people), etc all use an OpenAI compatible api endpoint. Not to mention OpenAI, Google, Anthropic, Deepseek, Openrouter, etc all mainly use (or at least fully supports, in the case of Google) an OpenAI compatible endpoint.


You could contribute an option!


> Locking it to Ollama is stupid.

If you don’t like this free and open source software that was shared it’s luckily possible to change it yourself…or if it’s not supporting your favorite option you can also just ignore it. No need to call someone’s work or choices stupid.


Strong disagree. Just because something is free and open source does not make it good. Call a spade a spade.

Ollama is a piece of shit software that basically stole the work of llama.cpp, locks down their GGUFs files so it cannot be used by other software on your machine, misleads users by hiding information (like what quant you are using, who produced the GGUF, etc), created their own API endpoint to lock in users instead of using a standard OpenAI compatible API, and more problems.

It's like they looked at all the bad walled garden things Apple does and took it as a todo list.


That’s not the point, you didn’t say “Ollama is stupid” you said “Locking it to Ollama is stupid”.

Not every person is aware of all faults or politics of all their dependencies.


That's an absolutely terrible defense. Ignorance is not an excuse, try telling that to a police officer.

And plus, certain people are held to a higher standard. It's not like I'm expecting a random person on the street to know about Ollama, but someone building AI software is expected to research what they are using and do their due diligence. To plead ignorance is to assert incompetence at best and negligence at worst.


> Research shows 25% of web pages posted between 2013 and 2023 have vanished.

I’ve been personally working on a project over the past year which addresses the exact issue: https://linkwarden.app

An open-source [1] bookmarking tool to collect, organize and preserve contents on the internet.

[1]: https://github.com/linkwarden/linkwarden


Is there a way it could eventually function like a P2P version of archive.org, so that if anyone has a copy of a page (at a point in time I suppose?), it's available to anyone in the network?

If I understand correctly, right now it's more of a self hosted tool for personal archiving (which is great -- I'm a user myself), but something even more resilient harnessing network effects would be great to see.


I’m working on a self-hostable, open-source collaborative bookmark manager: https://linkwarden.app

Github: https://github.com/linkwarden/linkwarden


Woah that’s really cool!


This is one of the main reasons I created Linkwarden - to combat Link-Rot.

Linkwarden is an open-source collaborative bookmark manager to collect, organize and preserve webpages:

https://linkwarden.app


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: