>> A magical frog was counting unicorns. He saw 5 purple unicorns, 2 green unicorns, and 7 pink unicorns. However, he made a mistake and didn't see 2 unicorns: one purple and one green. Also, since he was a magical frog, he didn't see unicorns that were the same color as himself. How many unicorns did he count?
It correctly answers 11 for me.
To me this has demonstrated:
* "Understanding": It understood that "didn't see" implies he didn't count.
* "Knowledge": It knew enough about the world to know that frogs are often green.
* "Reasoning": It was able to correctly reason about how many should be subtracted from the final result.
* "Math": It successfully did some basic addition and subtraction, arriving at the correct answer.
Crucially, I made this up right here on the spot, and rolled a die for some of the numbers. This question does not exist anywhere in the training corpus!
I think this demonstrates an impressive level of intelligence, beyond what, up until about a year ago, I thought a computer would ever be capable of in my lifetime. Now, in absolute terms, of course current-gen ChatGPT is clearly far worse at reasoning and understanding than most people (well, specifically, it seems to me that its knowledge and reasoning are super-humanly broad, but child-level deep).
Can future improvements to this architecture improve the depth up to "AGI", whatever that means? I have no idea. It doesn't automatically seem impossible, but maybe what we see now is already near the limit? I guess only time will tell.
This puzzle is too poorly-worded to be solvable, due to the ambiguous nature of "see" and "count". Could you describe what the actual situation was, what the frog perceived it to be, and what color the frog was?
Ok, here's a (hopefully) better worded puzzle, again made up by myself right now.
There are 12 frogs. Five are green, 3 red, and 4 yellow. Two donkeys are counting the frogs. One of the donkeys is yellow, the other green. Each donkey is unable to see frogs that are the same color as itself, also each donkey was careless and missed a frog when counting. How many frogs does the green donkey count?
GPT4 answers 6 every time for me.
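The arithmetic behind that answer can be checked with a few lines, under one assumed reading of the puzzle: "can't see same-color frogs" removes that whole color group, and the careless miss removes one additional visible frog. A minimal sketch:

```python
# Frog-counting puzzle: checking the intended arithmetic.
# Assumption: a donkey cannot see any frog of its own color, and the
# careless miss subtracts one more from what it would otherwise count.
frogs = {"green": 5, "red": 3, "yellow": 4}

def donkey_count(donkey_color):
    # Frogs visible to this donkey: every color except its own.
    visible = sum(n for color, n in frogs.items() if color != donkey_color)
    return visible - 1  # each donkey carelessly misses one frog

print(donkey_count("green"))   # 12 - 5 green - 1 missed = 6
print(donkey_count("yellow"))  # 12 - 4 yellow - 1 missed = 7
```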
My point is that GPT is capable of a certain amount of "reasoning" about puzzles that most certainly don't exist in its training data. Playing with it, it's clear that in this current generation the reasoning ability doesn't go very deep: just change the above puzzle a little to make it even slightly more complicated and it breaks. The amazing thing isn't how good at reasoning it is, but that a computer can reason at all.
It clearly says he didn't see some of them either at all or as unicorns. The correct answer is 11.
Edit: I do see now that "He saw" kind of messes the question up. My intent would have been better expressed with "There were". But again this proves my point! GPT4 is able to (most of the time) correctly work through the poor wording and interpret the question the way I meant it, and I think the way most people would read it.
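For what it's worth, the arithmetic behind the 11 can be made explicit under that intended "There were" reading. The assumptions (labeled in the comments) are: the frog is green, the two missed unicorns are subtracted first, and any remaining green unicorn is invisible to a green frog.

```python
# Unicorn puzzle under the intended "There were ..." wording.
# Assumptions: the frog is green (the stereotypical frog color); the two
# careless misses are subtracted first; remaining green unicorns are then
# invisible to a green frog.
unicorns = {"purple": 5, "green": 2, "pink": 7}
missed = {"purple": 1, "green": 1}
frog_color = "green"

counted = 0
for color, n in unicorns.items():
    n -= missed.get(color, 0)   # the careless misses
    if color == frog_color:
        n = 0                   # same color as the frog: invisible
    counted += n

print(counted)  # (5-1) + 0 + 7 = 11
```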
The correct answer is 14. There is no logical/linguistic/semantic reason why "he didn't see a purple unicorn" should refer to the purple unicorn that he (according to your statement) did see. Compare: "He saw a red ball, but he didn't see one ball: a red one. How many balls did he see?" Also, regarding the green one: there is no _logical_ reason why a "magical" frog should be green. One can debate your question at length, but a semantically sound interpretation implies the frog saw 14 unicorns and the frog is not green. Anything else falls apart, because if the frog is green, then how could he have seen a green uni? Which is what you wrote for context.
I have a number of book length texts, most only in the target language, and a few bilingual or multilingual. For the bilingual and multilingual texts, I can script out probably several thousand pairs of "translate the following text from <source_lang> to <target_lang>: <source_lang_text> <target_lang_text>". Do I need to vary the prompt and format, or can I expect the LLM to generalize to different translation requests? Is there value in repeating the material in different lengths? One set of sentence lengths, another paragraph, and another page or chapter length? Also what should be done with the monolingual texts, just ignore them?
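The scripting step you describe could look something like the sketch below. The input format (a list of aligned `(source, target)` segment tuples) and the prompt template are assumptions, not a required shape; varying the template wording, and emitting sentence-, paragraph-, and chapter-length segments, is cheap once the alignment exists, and both translation directions come nearly free from the same pair.

```python
# A minimal sketch of generating fine-tuning pairs from aligned bilingual
# segments. The segment format and prompt template are assumptions.
import json

def make_pairs(segments, src_lang, tgt_lang):
    pairs = []
    for src_text, tgt_text in segments:
        # Emit both translation directions from one aligned pair.
        pairs.append({
            "prompt": f"Translate the following text from {src_lang} to {tgt_lang}:\n{src_text}",
            "completion": tgt_text,
        })
        pairs.append({
            "prompt": f"Translate the following text from {tgt_lang} to {src_lang}:\n{tgt_text}",
            "completion": src_text,
        })
    return pairs

segments = [("Guten Morgen.", "Good morning.")]  # invented placeholder data
for rec in make_pairs(segments, "German", "English"):
    print(json.dumps(rec, ensure_ascii=False))
```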
If you want to fine tune Llama 2 or similar, then embed each pair together and separately and store them. Then, use the unlabeled data (the source text without translation) to query the embeddings for similar matches. You then send in the necessary prompt text with the matches, plus the text to translate. You'll want to do this with a foundational model, like GPT-x.
As noted below, extracting words or keyterms would maybe be a good idea, as they could be included in the training set.
The training set would then be composed of the prompt, the translation, and keyterms. As you will want to vet the generated texts anyway, you could then decide whether the foundational model was working well enough. You could also try running the largest "open" model you can find on the prompts, to see if those need training as well. There are many Llama models on HuggingFace trained for language pairs, so see if your languages are already covered and test those.
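The retrieve-then-prompt flow described above can be sketched as follows. Note that `embed()` here is a deliberately toy stand-in (a character-frequency vector) just to make the sketch runnable; in practice you would call a real embedding model such as a sentence-transformer, and the example sentence pairs are invented.

```python
# Sketch of retrieving similar stored pairs to build a translation prompt.
# embed() is a toy stand-in for a real embedding model; the stored pairs
# are invented placeholder data.
import math
from collections import Counter

def embed(text):
    # Toy embedding: character-frequency vector of the lowercased text.
    return dict(Counter(text.lower()))

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Store: each source/target pair alongside its (source-side) embedding.
store = [
    {"src": "Good morning.", "tgt": "Guten Morgen."},
    {"src": "See you tomorrow.", "tgt": "Bis morgen."},
]
for rec in store:
    rec["vec"] = embed(rec["src"])

def build_prompt(query, k=1):
    # Rank stored pairs by similarity to the query, keep the top k,
    # and prepend them as in-context examples.
    qv = embed(query)
    matches = sorted(store, key=lambda r: cosine(qv, r["vec"]), reverse=True)[:k]
    examples = "\n".join(f"{m['src']} -> {m['tgt']}" for m in matches)
    return f"Example translations:\n{examples}\n\nTranslate: {query}"

print(build_prompt("Good evening."))
```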
I'm building a simple, Open Source ML pipeline manager at https://ai.featurebase.com/. I'd be down to help you with this!
Language translation can be tricky because of the underlying nuances in each language, so more context would probably be better; using multiple steps to evaluate its performance at the key level would be a good way to improve confidence.
It might be beneficial to start your dataset at the key (word) level, generate some embeddings of the key pair in the source and target and stash them, then do the same for sentence level and just for fun, paragraph level. (I believe you could get enough context from the sentence level as a paragraph is just a group of sentences but it would still be interesting to generate paragraph level key pairs I think).
From there you’d have a set of embeddings of each word src:tgt that also has context of how it fits in a sentence level and paragraph level with the respective nuances of each language.
Once you have that dataset then you can augment your data with prompts like you’re using but also including some contextual references of word pairs, and sentence pairs in your prompt which should corner the LLM into the right path.
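The prompt-augmentation step described above, pulling word-pair context into the translation prompt, could be sketched like this. The glossary entries are invented examples; in practice they would come from the word-level embedding store.

```python
# Sketch of augmenting a translation prompt with word-level context.
# The glossary entries are invented placeholders; in practice they would
# be retrieved from the word-level embedding dataset described above.
glossary = {
    "bank": "Ufer",    # riverbank sense, not the financial one
    "river": "Fluss",
}

def augment_prompt(sentence, glossary):
    # Pull in glossary pairs for words that actually occur in the sentence.
    hits = {w: t for w, t in glossary.items() if w in sentence.lower()}
    context = "\n".join(f"{w} = {t}" for w, t in sorted(hits.items()))
    return (f"Relevant word pairs:\n{context}\n\n"
            f"Translate to German: {sentence}")

print(augment_prompt("We walked along the river bank.", glossary))
```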
Edit: not an expert so will heed if someone smarter comes along.
Oh, yes, pairs of words is a good idea. I also have a bilingual dictionary and can generate a prompt for each entry, something like "here's a word in <lang_a>, write a dictionary definition for it in <lang_b>: <lang_a_word>: <lang_b_definition>".
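Generating those dictionary-entry prompts could follow the same pattern; a minimal sketch, where the entries and the template wording are invented placeholders:

```python
# Sketch of turning bilingual dictionary entries into training prompts.
# The entries and template are invented placeholders.
entries = [
    ("Haus", "a building for human habitation; a house"),
    ("Hund", "a domesticated carnivorous mammal; a dog"),
]

def dictionary_prompts(entries, lang_a="German", lang_b="English"):
    for word, definition in entries:
        yield {
            "prompt": (f"Here's a word in {lang_a}; write a dictionary "
                       f"definition for it in {lang_b}: {word}"),
            "completion": definition,
        }

for rec in dictionary_prompts(entries):
    print(rec["prompt"], "->", rec["completion"])
```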
People who practice Breatharianism claim to live without food and get sustenance from various mystical sources, including sunshine (they either don't live very long or else are frauds and do secretly eat).
This is something I've tried to figure out this week.
I tried Youtube Kids because I wanted the content to be pre-approved only, I liked the toddler-friendly controls, and the app can go in single-app mode on the iPad. It's subscription-based, which is fine, but the main problem is they won't let me approve the content I want. I can see some great kids' stories and learning videos in our language on regular Youtube, but can't seem to find a way to get them into the Youtube Kids app. (And I certainly am not going to turn her loose on regular Youtube.)
There don't seem to be any other apps that do what I want so I ended up setting up a Plex media server and use yt-dlp to download the videos for her. This works pretty well, but is a lot more work. And the app is not great.
> The attacker conducted a series of sophisticated voice phishing attacks under the guise of various trusted organizations attempting to convince the victim to accept multi-factor authentication (MFA) push notifications initiated by the attacker.
This is why we need webauthn everywhere! Push notifications and friends are all susceptible to this kind of attack, webauthn is not.
I've been text messaging with random Russians a bunch over the last couple weeks and have had conversations with ~100 people, some of them quite extensive. Not once has anyone really condemned the war. A few are cynical about their government, people, and the rest of the world, and sad that this has to happen. But the vast majority weren't even that. Just straight up nationalist, repeatedly saying that they had to do something or they would be destroyed by the west. And insisting that Ukraine belonged to Russia. Absolutely no-one denied the facts as far as the toll on human life or the destruction of Ukraine, and not a single person thought it's a "special operation" instead of a war. I heard a lot of what seems to me to be extreme paranoia about NATO, America, and Nazis just waiting for an opportunity to invade and destroy them.
The whole experience has been kind of shocking to me. I went into this believing that Russians were like innocent captives of evil politicians and oligarchs and had no idea what was going on. But based on the conversations I've had, that is not the case at all. They know exactly what is going on as far as the cost in human lives, and believe it's necessary to ensure their own survival.
You know, if I lived in Russia right now, and some random person I never heard of started texting me and talking about the war, I'd at least have to consider the possibility that they were FSB, and guard my words accordingly.
Not saying that all the people you texted thought that way, but I'm not sure you can reach a definitive conclusion based on your experiment.
This is a great point! Without verifying the entity on the other end of the line, and given the risk that my words could be used against me—and my loved ones—I'd toe the line and parrot back the official position of the corrupt gov't. What's the upside of speaking out to some random SMS?
Yes, of course it's likely there are some who oppose the war but are uneasy about expressing that to a stranger. I was just surprised to get no support at all.
I think you need to watch this one[0]; it's an explanation of what you are seeing. In general, that is the propaganda doing its job. It's similar in all countries; we all have our propaganda and our view of our situation in the world. Russia's is the classic "It's us against them! And we are surrounded by our enemies!" I'm sure North Korea is about the same, only worse.
I haven't really talked to anyone who's contested the fact that thousands of Russian soldiers and Russian speaking Ukrainian civilians have been killed.
> ...They know exactly what is going on as far as the cost in human lives, and believe it's necessary to ensure their own survival.
Given the propaganda and the extensive coverups by govt, it's not really clear what exactly the people know in terms of counts, but for sure the scale of the war is hard to cover up by now.
However, knowing the actual facts is rather irrelevant for them. The previous decades of "rising from their knees" in fact gave even more rise to the cynicism - another product of the "developed socialism".
Thus lots of educated citizens of Russia would readily accept and justify any form of twisted falsehood as a given, a kind of rules of engagement for their daily act of survival.
Sadly, similar cynicism seems to have its place in the lives of many people from the former Soviet republics, Ukraine included. In Ukraine, though, there happened to be a practical succession of governments and presidents. This has been powering a hope for change, a better future, access to Europe. Meanwhile, the Russian Federation got clearly stuck "in progress", same as Belarus and the Central Asian republics. Sans alternative.
How does one reconcile with such a reality? Well, if unable to leave, then adopting some cynical attitude to anything may perhaps become a choice.
I had a good laugh when I saw my grandma using her iPad calculator app to do her finances. The app was displaying an ad; she didn't want to see it, so she stuck a sticker over that part of the screen. :-)
Is my grandma's sticker going to be illegal in this forced advertising future? If not, where in the stack between the webpage and your eyes will it start to be legal to block? What about blocking in the graphics stack in the OS? The display? A sticker in front of the display? A smart film in front of the display? Smart glasses? Will it be illegal to avert your eyes from advertisements, à la Black Mirror?
The Sgaw Karen language from Thailand and Burma has a system of what I've always called classifiers, but reading this, it sounds like they might be properly called genders too?
Everything is classified into one of a large number of classifiers; maybe a few dozen? Maybe hundreds, I'm not sure.
Some really common ones:
Rational beings like people, God, angels, etc. = "gha" (but not spirits, demons, ghosts, etc.; they are animals)
Flat things like the earth, plates, leaves, fields, the sky, etc. = "bae" (but modern Karens sometimes use "round" for the earth and moon instead)
Round things like balls, houses, rocks, a person or animals head, eyes, etc. = "pler"
Long skinny things like a stick, snake, road, etc = "bo"
Most kinds of animals = "doo" (but fish and birds are flat, and insects are round)
These words show up all over the place in basic grammar. Like "5 cows" would be "cows 5 doo". Sometimes they stand in for the actual name of what you're talking about, for example you might say "this cow" as "ta doo ee" and drop the word for cow entirely.
> classifiers, but reading this, it sounds like they might be properly called genders too?
No, not quite: classifiers do not introduce a notion of noun class/gender, and the East/South East Asian languages that make extensive use of them (notably, Sino-Burmese and Tai-Kadai languages) remain fully analytical languages.
A more apt comparison for classifiers would be collective nouns, of which English (and other Indo-European languages) has plenty, e.g.
– a pandemonium of enterprise architects;
– a tuxedo of Linux kernel developers;
– a dazzle of birds of paradise;
– a shiver of IT consultants;
– etc.
where the implicitly associated noun class/gender of the noun that is the focal point of the expression is «debased» into a collective one and can be applied across the noun class/gender boundaries. Pandemonium, dazzle, tuxedo and shiver are, effectively, classifiers.
Is it used only for counting? Mandarin and other East Asian languages have classifiers for counting. English uses counting classifiers sometimes. For example, "paper" without a classifier in English refers to an official document or essay, as in "I have to write a paper" or "Do you have your papers?"; otherwise you have to use "a sheet of paper", "some paper", "a pack of paper", etc. It's not exactly the same as the counting words in Mandarin, but they play a similar role in the grammar.
Yes, sort of, but for a lot more than counting. Thai has counting words too, I think, but they are not as central to the grammar or as flexible as Karen classifiers.
Sort of like you can say "a sheet of paper" or "5 sheets of paper" in English to count papers, but imagine you could also say "typing on a sheet", or "your sheet is full of typos", or "could you hand me a sheet", where "sheet" is a broad category and that you mean a sheet of paper comes from the context in which the sentence is spoken.
Edit: Another interesting use for them is disambiguation. Super useful if, like me, you're just learning and don't always nail the tones or pronunciation. For example, I might throw in the animal classifier in "ga cha ta doo" just to make sure no-one misunderstands my poor pronunciation of "elephant" as "mountain". That's a crude example, but native speakers benefit from the disambiguation too in colloquial speech.
Words with this range of meaning are simply called ‘noun classifiers’ in linguistics [0]. They’re pretty common both in the region and outside it: they’re also in Lao, Hmong and Minangkabau, as well as various Australian, Mayan and South American languages.
The moon doesn't visibly have bulk or mass like a ball, house, or mountain does. And I was referring to earth as the entire world. Edit: But the landscape, fields, meadows, districts, states, countries, and continents are all still flat.
Edit 2: Now that I think about it, I should have used "spherical" to describe that category rather than "round."
Ha! Around that age (actually probably a little older) I begged and pleaded with my mom to get saltpeter and sulfur to make firecrackers and rockets. Eventually she relented and got it for me, but with the stipulation that I had to wear a welding helmet, heavy coat, and heavy rubber gloves while working with it, and then only under supervision... Anyway, the actual gunpowder I managed to make didn't really work, just kind of fizzled out. I never managed to get a firecracker to actually make any noise. It sure was a lot of fun though. :-)
I want an appliance that combines the functions of refrigerator, pantry, stove, oven, and dishwasher, and that has manipulators inside capable of picking the ingredients, cleaning, peeling, chopping, and otherwise preparing them, cooking the entire meal, cleaning up, and spitting out the meal on serving dishes. :-)
It should be able to cook good-tasting homemade meals from real ingredients (even fresh from the garden if you have a garden), not just plastic-wrapped junk food. You load food items through a hopper or door on one side, then a manipulator arm picks them up, reads or otherwise identifies what they are, and stores them in the inbuilt pantry or refrigerator.
It should keep an inventory of the refrigerator and pantry and be able to suggest recipes that use what you have on hand while optimizing for cost and nutrition, and minimizing food wasted through spoilage. It should be able to produce shopping lists for you based on foods you use frequently or recipes you want to cook in the future. (It could even auto-order foods and have them delivered, although I like the idea of just getting a shopping list and buying it myself better.)
It should be able to help you avoid wasting leftover foods by reaccepting leftovers, storing them, displaying the leftovers as a suggestion for the next meals so you don't forget, and know the optimal way to reheat or re-prepare them.
It should be controlled through an app on your phone, over the local network, and definitely not dependent on some cloud service. You should be able to still get a nice hot supper if the Internet goes down.
The app should show you a list of recipes that can be made with the ingredients on hand, you pick one or a couple, it tells you how long it's going to take to prepare. You tap go, and 45min later you have a perfect homecooked supper!
Maybe so... but technology-wise I don't think there's anything here that doesn't already exist or that is even exceptionally advanced. It's just a matter of putting it all together and writing the code.
I'm skeptical--this market would be so incredibly huge that I cannot fathom everyone overlooking it.
Literally everyone would get this. It would make the world so much healthier--enable everyone to easily diet, let people subscribe to healthy meal plans, etc.
It would pay for itself even at astronomical prices, via health benefits, no need to go out for dinner, and reduced food prices via making things from scratch.
Buying in bulk would be more common and much easier. So much comes from grain, sugar, salt and yeast. Much less food would be wasted--no need to batch cook anything and ignore the leftovers. The robot would be baking your bread, making donuts and croissants, tortillas, tortilla chips, mayonnaise, yogurt, sauerkraut, pickles. Things we never consider making on our own, that are generally considered the base components to make other things. I think it could easily offset $300-$1k in monthly expenses for just about everyone.
My assumptions are that it can be tailored to follow a recipe, of which many widely-published recipes are created/curated by chefs and experts, and once that is done the robot rarely, if ever, fails to produce the very best version of the dish.
If the only hindrance is just sitting down and writing the code, someone ought to get busy, because this is about as impactful of a development for the planet as any could be.