Hacker News | Paul_S's comments

The speed of improvement of TTS models reminds me of the early days of Stable Diffusion. Can't wait until I can generate audiobooks without infinite pain. If I were an investor I'd short Audible.

An all-TTS audiobook offering is just about as appealing as an all-stable-diffusion picture gallery (that is, not at all).

Isn’t it more like an art gallery of prints of paintings? The primary art is the text of the book (like the painting in the gallery), TTS (and printing a copy) are just methods of making the art available.

I think it can be argued that audiobooks add to the art through the tone and inflection of the reader.

To me, what you're saying is the same as saying the art of a movie is in the script, the video is just the method of making it available. And I don't think that's a valid take


No, that's an incorrect analogy. The script of a movie is an intermediate step in the production process of a movie. It's generally not meant to be seen by any audiences. The script for example doesn't contain any cinematography or any soundtrack or any performances by actors. Meanwhile, a written work is a complete expressive work ready for consumption. It doesn't contain a voice, but that's because the intention is for the reader to interpret the voice into it. A voice actor can do that, but that's just an interpretation of the work. It's not one-to-one, but it's not unlike someone sitting next to you in the theater and telling you what they think a scene means.

So yes, I mostly agree with GP. An audiobook is a different rendering of the same subject. The content is in the text, regardless of whether it's delivered in written or oral form.


There are already audiobooks on Audible that are 100% TTS. They're listenable, but no substitute (yet) for a real human.

It's just too flat/dead compared to a human reader.


It's not perfect, but I already have a setup for doing this on my phone. Add SherpaTTS and Librera Reader to your phone (both available free on F-Droid).

Set up SherpaTTS as the voice model for your phone (I like the en_GB-jenny_dioco-medium voice option, but there are several to choose from). Add an ebook to Librera Reader and open it. There's an icon with a little person wearing headphones, which lets you send the text continuously to your phone's TTS, using just local processing on the phone. I don't have the latest phone but mine is able to process it faster than the audio is read, so the audio doesn't stop and start.

The voice isn't totally human sounding, but it's a lot better than the Microsoft Sam days, and once you get used to it the roboticness fades into the background and I can just listen to the story. You may get better results with Kokoro (I couldn't get it running on my phone) or similar TTS engines and a more powerful phone.

One thing I like about this setup is that if you want to swap back and forth between audio and text, you can. The reader scrolls automatically as it makes the audio, and you can pause it, read in silence for a while yourself and later set it going from a new point.


I feel like TTS is one of the areas that has evolved the least. Small TTS models have been around for like 5+ years and they've only gotten incrementally better. Giants like ElevenLabs make good sounding TTS but it's not quite human yet and the improvements get smaller with each iteration.

I've moved to https://github.com/readest/readest over audio books in most cases. I just need the dang thing in my ears and their TTS is good enough.

Wouldn't Audible be perfectly positioned to take advantage of this? They have the perfect setup to integrate this into their offering.

It seems more likely that people will buy a digital copy of the book for a few bucks and then run the TTS themselves on devices they already own.

Not likely at all, people pay for convenience. They don't want to do that

Yeah, Hacker News users keep thinking average consumers like to tinker like we do lol

eBooks are much more expensive than an Audible subscription though.

I wouldn't say so. Audible gives you 1 book a month for $15. Most e-books I see are around $10.

All the models I tried have similar problems. When trying to batch a whole audiobook, the only way is to run it, then run a model to transcribe and check you get the same text.
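A rough sketch of the "transcribe and check" step, assuming you already have the source text and a transcript of the generated audio as strings. The TTS and speech-to-text stages are left out, and the function names `normalize` and `find_mismatches` are made up for illustration. It only flags word-level differences, so homophones and number formatting ("10" vs "ten") would still need handling.

```python
import difflib
import re


def normalize(text):
    """Lowercase and drop punctuation so only the spoken words are compared."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_mismatches(source_text, transcript):
    """Return (source_words, transcript_words) pairs where the two texts differ."""
    src, hyp = normalize(source_text), normalize(transcript)
    # autojunk=False keeps frequent words ("the", "a") from being ignored on long texts.
    matcher = difflib.SequenceMatcher(a=src, b=hyp, autojunk=False)
    return [
        (" ".join(src[i1:i2]), " ".join(hyp[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

An empty result means the transcript matched the text after normalization. Anything else points at the words to re-generate or check by ear.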

Most drivers here don't adjust the light angle based on loading of the car (if you have passengers in the back in a small car that is enough to move your lights completely out of whack). I assume it's just laziness or a "not my problem" attitude but some drivers I have spoken to didn't know that lights are adjustable!


How about making the arseholes who dumped that rubbish clean it up.


There are traffic management cameras literally everywhere in the UK. It's impossible to drive up to that spot without passing by hundreds of them. The fact that they didn't charge anyone speaks volumes about the level of incompetence.


I read the article and I really don't understand. Linux already buffers files into RAM if there's any unused, why would you do this?


The "so you won't need to read files from disk" argument is bullshit because tmpfs data can be evicted to swap. If memory pressure is high you will still be reading from disk.

And high memory pressure is also what makes disk-backed /tmp slow. No improvement at all.


I ran my own NAS for over two decades in some old 4U I got for cheap, using whatever discarded consumer HW I got for free and I never got the point of Synology. Colleague who has one said it's compact. Well, this year I bought one of those gaming cube cases (with space for 10 drives, what do people do with them in gaming pcs? OK, only 8 spaces are actually drawers with grommets but physically you can fit 10) and retired my 4U.

Seriously, it takes an hour to set up your own NAS and you can mix any drives, set up any encryption you want, a seedbox etc. I totally understand convenience but this is not an email server you're setting up here, it's just a NAS.


I did something similar years ago. Couple of drives in an old beige tower case. Set up the sharing and whatnot. Not exactly 'hard' but it was one thing: time consuming. Once you have done it a few times the novelty wears off and it's more of a chore to mess around with the thing. NAS boxes like that 'just work'. You plug some drives in, set it up, done. However, one comment in here puts it perfectly. The software on syno is wildly out of date. It has been for 15 years. Ease of use is now outweighed by the need for something recent software-wise. The syno guys are literally leaving nearly 20-30% perf uplift out. For 'reasons'. Those reasons are wearing very thin. It will mean I need a different backup solution for my computers. One that handles full disk incremental backups and stores Windows and Linux on the remote drive, not something I 'run once in a while', and preferably open source.


20 years ago it was a chore but nowadays it's faster than baking a cake. 10 minutes prep time (configure OS and add drives), 10 minutes bake time (installation) (or 10+ hours bake time if you count building the array)

But let's assume you don't have a clue and have to follow some tutorial and do some reading and it takes you 2 hours. That's amortised across a decade. Especially now when easy distro upgrades are basically unattended so you can use the same setup for a decade and stay up to date.


I'm interested in starting out like this, I have a bunch of 2.5" SSDs I'm not using. Do you have any tips on what cube to get? Are you concerned about power usage at all, especially if this is always on?


Any of those cube gaming ones I think are great. I got a dual chamber one which makes shuffling drives and cabling easy. Can't remember the name but it was 90 pounds, way more than I paid for the old 4U, although inflation from the 90s probably means it was more expensive in real terms. Most of the power is used to spin rust so not sure it's worth worrying about the HW power use, just use whatever old pc you can get for free, ask colleagues and family, people throw out working PCs all the time, it's a NAS, not a rendering farm, if it boots it's good enough IMHO.


Found the name: Fractal Node 804.


Great- thanks for this.


Since the server is overloaded apparently and people can't see what this is without connecting:

  Connection to clearsky.dev (155.133.22.147) 29438 port [tcp/*] succeeded!
  AXR2KPxr2000.leo.spacenet:send a single 0 byte followed by 'XR2K' for documentation.
which results in:

  SPACE TERMINAL INTERFACE PROTOCOL
  =================================
  The XR-2000 space terminal is a communications satellite that is used to communicate with spacecraft.

  This protocol allows a user to remotely use the XR-2000 space terminal to send messages to and receive messages from spacecraft using the global space communication network.

  PACKET STRUCTURE
  ----------------
  Packets are sent over a TCP connection that is established by the client to the terminal on port 29438.

  packet header: 
  * 2 bits length field length (LFL).
    - 0: packet has no payload. 
    - 1: 1 byte length field
    - 2: 2 byte length field
    - 3: 4 byte length field
  * 1 bit: request id field present
  * 5 bits packet type.
  * 1 byte request id (if present)
  * 4 bytes magic
  * 0, 1, 2 or 4 bytes payload length
  * 0..n bytes payload

  all integers are little endian and unsigned unless indicated otherwise.

  The request id field can be used by the client to correlate requests with their responses. The value may be chosen by the client. If the field is present in a request packet, it will be copied into the response by the terminal.

  The magic bytes contain the ascii text: XR2K

  PACKET TYPES
  ------------
  The following packet types are defined:

  * 0x00 help           client -> terminal
  * 0x01 hello          terminal -> client
  * 0x02 documentation  terminal -> client
  * 0x03 register       client -> terminal
  * 0x04 registered     terminal -> client
  * 0x05 login          client -> terminal
  * 0x07 getstatus      client -> terminal
  * 0x08 status         terminal -> client
  * 0x09 getmail        client -> terminal
  * 0x0a mail           terminal -> client
  * 0x0b sendmail       client -> terminal
  * 0x12 configure      client -> terminal
  * 0x14 route          both
  * 0x15 translate      client -> terminal
  * 0x16 translation    terminal -> client
  * 0x1f result         terminal -> client

  0x00 HELP PACKET
  -------------------------
  This packet is used by the client to request the protocol documentation. The terminal will respond with a DOCUMENTATION (0x02) packet.

  This packet has no contents, which means the Length-Field-Length is 0. The request id field can also be disabled. So the client only needs to send a single 0-byte to obtain the documentation.

  0x01 HELLO PACKET
  -----------------
  This packet is sent by the terminal immediately after a new TCP connection has been established.

  Packet payload:
  * 1 byte: protocol version. The only defined version is 1
  * 1 byte: terminal hostname length
  * 0..255 bytes: terminal hostname
  * 1 byte: documentation instruction length
  * 0..255 bytes: documentation instructions

  0x02 DOCUMENTATION PACKET
  -------------------------
  This packet is sent by the terminal to provide this document, the protocol spec, to the client.

  The contents of this packet is the protocol spec.

  0x03 REGISTER PACKET
  --------------------
  This packet is sent by the client to create a new user account. This packet is only valid if the user is not yet authenticated.

  This packet has no payload.

  If registration succeeds, the terminal will respond with a REGISTERED (0x04) packet, which contains the username and password of the created user. The client should then save these values so that it can LOGIN later.

  If registration fails, the terminal will respond with a RESULT (0x1f) packet.

  Possible errors:
  * 0x11 registration rate limiting: too many accounts were registered from this IP.
  * 0x1 already authenticated

  0x04 REGISTERED
  ---------------
  This packet is sent by the terminal to the client to provide the credentials to the newly created account.

  Packet contents:
  * 1 byte: username length
  * 0..255 bytes: username
  * 1 byte: password length
  * 0..255 bytes: password

  0x05 LOGIN PACKET
  -----------------
  This packet is sent by the client to authenticate using credentials that have been previously obtained from the REGISTERED message. This packet is only valid if the user is not yet authenticated.

  Packet contents:
  * 1 byte: username length
  * 0..255 bytes: username
  * 1 byte: password length
  * 0..255 bytes: password

  If authentication succeeds, the terminal will respond with a RESULT (0x1f) packet with error type 0. Additionally, a STATUS (0x08) packet will be sent to reflect the authenticated status.

  If authentication fails, the terminal will send a RESULT (0x1f) message with the appropriate error type.

  Possible errors:
  * 0x1 already authenticated
  * 0x3 invalid credentials

  0x07 GETSTATUS PACKET
  ---------------------
  This packet is used by the client to request a STATUS (0x08) packet.

  This packet has no payload.

  This request will always succeed, no errors are defined.

  0x08 STATUS PACKET
  ------------------
  This packet is sent by the terminal either as a response to a GETSTATUS (0x07) packet or because of a specific event such as logging in or a new mail message arriving.

  If the user is not authenticated, the number of emails field will be -1 (0xffffffff)

  Packet contents:
  * 4 bytes: number of mails
  * 4 bytes: connection time (in seconds)
  * 1 bit: authenticated
  * 1 bit: authorized for transceiver usage
  * 1 bit: transceiver configured
  * 5 bits: undefined

  0x09 GETMAIL PACKET
  -------------------
  This packet is used by the client to request the contents of a mail. The XR-2000 mail system is very simple. Each user has a mailbox. Incoming emails receive an id starting with 1. Mails cannot be edited or deleted. Sent emails are not stored in the sender's mailbox. The mail system is only internal and not connected to the internet email system.

  This request is only valid when the user is authenticated.

  If the referenced mail is found, the terminal will respond with a MAIL (0x0a) packet. If not, the terminal will send a RESULT (0x1f) packet indicating the appropriate error type.

  Packet contents:
  * 4 bytes: mail id

  Possible errors:
  * 0x02 not authenticated
  * 0x40 mail not found

  0x0a MAIL PACKET
  ----------------
  This packet is the response to a GETMAIL (0x09) packet and contains an email metadata and full contents.

  Packet contents
  * 4 bytes: mail id
  * 4 bytes: timestamp (unix)
  * 1 byte: sender length
  * 0..255 bytes: sender username
  * 4 bytes: content length
  * 0..n bytes: contents

  0x0b SENDMAIL PACKET
  --------------------
  This packet is used by the client to send a mail to another user.

  If the mail is sent successfully, the terminal will respond with a RESULT (0x1f) packet with error type 0. If an error occurred, the error type will be set to the appropriate value.

  Packet contents
  * 1 byte: recipient length
  * 0..255 bytes: recipient username
  * 4 bytes: content length
  * 0..n bytes: contents

  Possible errors:
  * 0x02 not authenticated
  * 0x41 recipient username not found

  0x12 CONFIGURE PACKET
  ---------------------
  This packet is used by the client to configure the XR2000 transceiver. This packet is only valid if the user is authenticated.

  Packet contents:
  * 4 bytes: frequency (in kHz)
  * 4 bytes: baudrate (in bps)
  * 1 byte: modulation (see below)

  Modulation types:
  * 0x00 Amplitude Modulation (AM)
  * 0x01 Frequency Modulation (FM)
  * 0x02 Phase Modulation (PM)
  * 0x03 Binary Phase Shift Keying (BPSK)

  If configuration succeeds, the terminal will respond with a RESULT (0x1f) packet with error type 0.

  If configuration fails, the terminal will send a RESULT (0x1f) message with the appropriate error type.

  Possible errors:
  * 0x02 not authenticated
  * 0x04 not authorized for transceiver usage
  * 0x20 transceiver malfunction
  * 0x21 invalid config parameter

  0x14 ROUTE PACKET
  -----------------
  This packet is used by both sides to transport data to and from the spacecraft. It is only valid when the user is authenticated and the transceiver is configured.

  If the terminal can send the packet out to the transceiver, there is no response: ROUTE packets are not acknowledged, and there is no guarantee that the data is received by the spacecraft.

  The packet contents are the bytes that are sent to or from the spacecraft without any additional headers.

  Possible errors:
  * 0x02 not authenticated
  * 0x04 not authorized for transceiver usage
  * 0x24 transceiver not configured
  * 0x25 transceiver malfunction

  0x15 TRANSLATE PACKET
  ---------------------
  This packet is used by the client to use the Rasvakian dictionary built into the XR-2000.

  If translation of the requested word is available, the terminal will respond with a TRANSLATION (0x16) packet. Otherwise, a RESULT (0x1f) packet will be sent.

  The contents of the packet is the word that needs to be translated.

  Possible errors:
  * 0x12 translation limiting: send max 1 translate request per second.
  * 0x50 translation not found

  0x16 TRANSLATION PACKET
  -----------------------
  This packet is sent by the terminal to provide a response to a TRANSLATE (0x15) packet.

  The contents of the packet is the Atlantian translation of the requested word.

  0x1f RESULT PACKET
  -----------------
  This packet is sent from the terminal to the client to indicate an error processing the last packet sent to the terminal.

  Some request types also have a RESULT packet as a response to indicate success. In this case the error type field value will be 0
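
For anyone who wants to poke at it, here is a minimal Python sketch of the packet framing described in the spec above. The bit order within the first byte is an assumption, since the spec does not state it: it is inferred from the hello bytes shown earlier, which begin with `A` (0x41) followed by `XR2K` and `P` (0x50 = 80, the payload length). That is consistent with the LFL in the top two bits, the request-id flag next, and the packet type in the low five bits. The helper names (`encode_packet`, `decode_packet`, `pack_str`, `login_payload`, `parse_hello`) are hypothetical, and ASCII encoding for strings is also an assumption.

```python
MAGIC = b"XR2K"
# Length-field-length (LFL) code -> size in bytes of the payload length field.
LFL_SIZES = {0: 0, 1: 1, 2: 2, 3: 4}


def encode_packet(ptype, payload=b"", request_id=None):
    """Build one packet: header byte, optional request id, magic, length, payload."""
    if not 0 <= ptype <= 0x1F:
        raise ValueError("packet type must fit in 5 bits")
    n = len(payload)
    # Pick the smallest length field that can hold the payload size.
    lfl = 0 if n == 0 else 1 if n <= 0xFF else 2 if n <= 0xFFFF else 3
    # Assumed bit order: LFL in bits 7-6, request-id flag in bit 5, type in bits 4-0.
    header = (lfl << 6) | ((request_id is not None) << 5) | ptype
    out = bytes([header])
    if request_id is not None:
        out += bytes([request_id])
    out += MAGIC
    out += n.to_bytes(LFL_SIZES[lfl], "little")  # little endian, per the spec
    return out + payload


def decode_packet(data):
    """Parse one packet from the front of data; return (type, request_id, payload, rest)."""
    header = data[0]
    lfl, has_id, ptype = header >> 6, bool(header & 0x20), header & 0x1F
    pos = 1
    request_id = None
    if has_id:
        request_id = data[pos]
        pos += 1
    if data[pos:pos + 4] != MAGIC:
        raise ValueError("bad magic")
    pos += 4
    size = LFL_SIZES[lfl]
    length = int.from_bytes(data[pos:pos + size], "little")
    pos += size
    payload = data[pos:pos + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return ptype, request_id, payload, data[pos + length:]


def pack_str(text):
    """Encode a string as a 1-byte length followed by its bytes (max 255)."""
    raw = text.encode("ascii")
    return bytes([len(raw)]) + raw


def login_payload(username, password):
    """Payload of a LOGIN (0x05) packet: two length-prefixed strings."""
    return pack_str(username) + pack_str(password)


def parse_hello(payload):
    """Split a HELLO (0x01) payload into (version, hostname, instructions)."""
    version, host_len = payload[0], payload[1]
    host = payload[2:2 + host_len].decode("ascii")
    doc_len = payload[2 + host_len]
    doc = payload[3 + host_len:3 + host_len + doc_len].decode("ascii")
    return version, host, doc
```

With this layout, `encode_packet(0x00)` produces the single 0 byte followed by `XR2K` that the hello message asks for. Reading from the socket until a full packet has arrived is left out.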


I love that most of those designs give bigger houses to people in a spaceship than modern houses in the UK on planet earth.

Love the designs, doubt democracy would get them through more than 250 days, let alone 250 years.


> love that most of those designs give bigger houses to people in a spaceship than modern houses in the UK on planet earth

Would such a project be particularly volume constrained?

> doubt democracy would get them through more than 250 days, let alone 250 years

I don't. You'd be selecting for extraordinary individuals and educating them. These sorts of societies propagated for hundreds or even thousands of years in antiquity just fine.

The colonists would be in a life-or-death system in a community small enough that everyone knows of everyone else personally. To the extent humans are almost uniquely exceptional at one thing as a hominid, it's exploration and colonization--I wouldn't be surprised if this group winds up more functional due to scratching an underlying human need to explore and push boundaries.


>Would such a project be particularly volume constrained?

It would be mass constrained because of the sheer cost of getting it all into orbit, even with advanced tech such as a space elevator. And more volume = more mass.

There is a saying in aerospace design along the lines of 'weight breeds weight'. Heavier components necessitate stronger, and therefore heavier, supporting structures.


Are you telling me a country is more constrained by space than a spaceship?

As for democracy "These sorts of societies propagated for hundreds or even thousands of years in antiquity just fine" - I don't know of any that practised the consensus driven democracy that almost all these proposals use. And if you're reaching into antiquity then not even normal democracies. Unless you're talking about an Athens with its slaves and only the adult male citizen population having a vote. In which case sure, I can get behind that but that's not what those spaceship designs propose. They all assume all decisions will be unanimous and no one will ever break the law.

In actual fact history proves the opposite and all exploration and conquest is driven by strict hierarchical organisations and the idea that you can fly a spaceship across light years without a captain who can condemn people to death is laughable.


> a country is more constrained by space than a spaceship?

At the point that we're building 60 km spaceships, yes, I think that's a possibility.

> you're reaching into antiquity then not even normal democracies

The further back we go the more consensus-driven small societies get. I'm also reaching back due to familiarity. There are plenty of small island communities that did fine for generations on their own.

> They all assume all decisions will be unanimous and no one will ever break the law

Sorry, I missed this in the winning design. Where does it say that?

> all exploration and conquest is driven by strict hierarchical organisations

If you need to bring an army, yes. I don't think we know how hierarchical Polynesian settlers were.


> Sorry, I missed this in the winning design. Where does it say that?

I didn't notice any prisons included in the design, so that assumption seems fair.


> didn't notice any prisons included in the design, so that assumption seems fair

Does it?

You don’t need dedicated prison space as you won’t have a permanent prison population. (Depending on labour requirements and resource availability this may not be a choice.) Nothing about not having a prison implies no hierarchy. And you don’t need prisons to “condemn people to death.”


I guess the Vatican could be?


Hypothetically, if you designed something that resembled the UK and tried to have humans live in it, that would not be ethical.


Yankee go away


That includes women, children and elderly. If you count fighting age men only, 1M becomes significant. If you count men actually available for draft, you're already at 10% loss.

