In a sense, Adrian Thompson kicked this off in the '90s when he applied an evolutionary algorithm to FPGA hardware. Using a "survival of the fittest" approach, he taught a board to discern the difference between a 1 kHz and a 10 kHz tone.
The final generation of the circuit was more compact than anything a human engineer would ever come up with (reducible to a mere 37 logic gates), and utilized all kinds of physical nuances specific to the chip it evolved on - including feedback loops, EMI effects between unconnected logic units, and (if I recall) operating transistors outside their saturation region.
That's not a lot of discussion—we should have another thread about this sometime. If you want to submit it in (say) a week or two, email hn@ycombinator.com and we'll put it in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page.
If you’re up for sharing, I’m curious to know approximately how many hours each week you spend working on HN. It seems like it would be an enormous amount of time, but I’m just guessing.
I suspect cloning tech is out there and Dang(s) are one of the first successful iterations. I just don't get how there is seemingly no time off, no vacations, no sick days, etc. Talk about passion.
The other alternative is the image of a pale guy with a laptop on some beautiful beach or climbing some crazy peak. Same passion, just concentrated in one body.
Sorry for the confusion! I know it's weird but the alternative turns out to be even more confusing and we've never figured out how to square that circle.
I think dang did something manual to push it back to the frontpage, and that reset the timestamps on everyone’s existing comments…
There is a comment here by me which says "2 hours ago", but I swear I wrote it longer ago than that - indeed, my threads page still says I wrote it 20 hours ago. So it is like part of the code knows when I really wrote it, while another part now thinks I wrote it 18 hours later than I did…
Yes, the relativized timestamps only show on /news (i.e. the frontpage) and /item pages. You can always see the original timestamps on other pages, like /submitted, /from, or (as you say) /threads.
Edit: I checked the code and the actual list is:
'(news item reply show ask active best over classic).
Operating transistors outside the linear region (the saturated "on") at billion-plus scale is something that we as engineers and physicists haven't quite figured out, and I am hoping that this changes in the future, especially with the advent of analog neuromorphic computing. The quadratic region (before the "on") is far more energy efficient, and the non-linearity could actually help with computing, not unlike the activation function in an NN.
Of course, modeling the nonlinear behavior is difficult. My prof would say that for every coefficient in SPICE's transistor models, someone dedicated an entire PhD (and there are a lot of these coefficients!).
I haven't been in touch with the field since I moved up the stack (numerical analysis/ML), but I would love to learn more if there has been recent progress in this area.
The machine learning model didn’t discover something that humans didn’t know about. It abused some functions specific to the chip that could not be repeated in production or even on other chips or other configurations of the same chip.
That is a common problem with fully free-form machine learning solutions: they can stumble upon something that technically works in their training set, but that any human who understood the full system would never actually use, due to the other problems associated with it.
> The quadratic region (before the "on") is far more energy efficient
Take a look at the structure of something like CMOS and you’ll see why running transistors in anything other than “on” or “off” is definitely not energy efficient. In fact, the transitions are where the energy usage largely goes. We try to get through that transition period as rapidly as possible because minimal current flows when the transistors reach the on or off state.
There are other logic arrangements, but I don’t understand what you’re getting at by suggesting circuits would be more efficient. Are you referring to the reduced gate charge?
> Take a look at the structure of something like CMOS and you’ll see why running transistors in anything other than “on” or “off” is definitely not energy efficient. In fact, the transitions are where the energy usage largely goes. We try to get through that transition period as rapidly as possible because minimal current flows when the transistors reach the on or off state.
Sounds like you might be thinking of power electronic circuits rather than CMOS. In a CMOS logic circuit, current does not flow from Vdd to ground as long as either the p-type or the n-type transistor is fully switched off. The circuit under discussion was operated in subthreshold mode, in which one transistor in a complementary pair is partially switched on and the other is fully switched off. So it still only uses power during transitions, and the energy consumed in each transition is lower than in the normal mode because less voltage is switched at the transistor gate.
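For reference, the textbook first-order figures behind this (not from the thread, just the standard approximations): the energy dissipated per output transition and the resulting dynamic power are roughly

```latex
E_{\text{switch}} \approx \tfrac{1}{2} C_L V_{DD}^2, \qquad
P_{\text{dyn}} \approx \alpha \, f \, C_L V_{DD}^2
```

where C_L is the switched load capacitance, α the activity factor and f the clock frequency, with static power set mainly by leakage. Subthreshold or reduced-swing operation attacks the V² term, which is where the per-transition energy saving comes from, at the cost of speed.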
> In a CMOS logic circuit, current does not flow from Vdd to ground as long as either the p-type or the n-type transistor is fully switched off.
Right, but how do you get the transistor fully switched off? Think about what happens during the time when it’s transitioning between on and off.
You can run the transistors from the previous stage in a different part of the curve, but that’s not an isolated effect. Everything that impacts switching speed and reduces the current flowing to turn the next gate on or off will also impact power consumption.
There might be some theoretical optimization where the transistors are driven differently, but at what cost of extra silicon and how delicate is the balance between squeezing a little more efficiency and operating too close to the point where minor manufacturing changes can become outsized problems?
The previous poster was probably thinking about very low power analog circuits or extremely slow digital circuits (like those used in wrist watches), where the on-state of the MOS transistors is in the subthreshold conduction region (while the off state is the same off state as in any other CMOS circuits, ensuring a static power consumption determined only by leakage).
Such circuits are useful for something powered by a battery that must have a lifetime measured in years, but they cannot operate at high speeds.
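For the curious, the textbook subthreshold ("weak inversion") drain current is approximately exponential in the gate voltage rather than quadratic:

```latex
I_D \approx I_0 \, e^{(V_{GS}-V_{th})/(n V_T)} \left(1 - e^{-V_{DS}/V_T}\right)
```

with V_T = kT/q ≈ 26 mV at room temperature and n the subthreshold slope factor. The currents involved are tiny (nanoamps or less), which is why such circuits sip power but cannot switch fast.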
Unfortunately not. This is analogous to writing a C program that relied on undefined behavior on the specific architecture and CPU of your developer machine. It’s not portable.
The behavior could change from one manufacturing run to another. The behavior could disappear altogether in a future revision of the chip.
The behavior could even disappear if you change some other part of the design that then relocated the logic to a different set of cells on the chip. This was noted in the experiment where certain behavior depended on logic being placed in a specific location, generating certain timings.
If you rely on anything other than the behavior defined by the specifications, you’re at risk of it breaking. This is a problem with arriving at empirical solutions via guess and check, too.
Ideally you’d do everything in simulation rather than on-chip where possible. The simulator would only function in ways supported by the specifications of the chip without allowing undefined behavior.
>The behavior could change from one manufacturing run to another. The behavior could disappear altogether in a future revision of the chip.
That's the overfitting they were referring to. Relying on the individual behaviour is the overfit. Running on multiple chips (at learning time) reduces the benefit of using an improvement that is specific to one chip.
You are correct that simulation is the better solution, but you have to do more than just limit it to the operating range of the components; you have to introduce variances similar to the specified production precision. If the simulator assumed that the behaviour of two similar components was absolutely identical, then within-tolerance manufacturing errors could be magnified.
In other words, optimization algorithms in general are prone to overfitting. Fortunately there are techniques to deal with that. Thing is, once you find a solution that generalizes better to different chips, it probably won't be as small as the chip-specific one.
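A minimal sketch of what that looks like in practice (toy code with hypothetical names and a stand-in for the real simulator): score each candidate across several simulated devices whose parameters are drawn from the manufacturing tolerances, so a trick that only pays off on one specific device stops winning.

```python
import random

def make_chip(rng):
    # A toy "device instance": parameters drawn from the spec'd tolerances.
    return {"gain": rng.gauss(1.0, 0.05), "delay": rng.gauss(1.0, 0.10)}

def simulate(candidate, chip):
    # Stand-in for the real circuit simulator; returns a fitness score.
    return sum(candidate) * chip["gain"] - len(candidate) * chip["delay"]

def robust_fitness(candidate, chips):
    # Aggregate over device instances; worst-case is stricter than the mean.
    return min(simulate(candidate, chip) for chip in chips)

rng = random.Random(42)
chips = [make_chip(rng) for _ in range(5)]
population = [[rng.random() for _ in range(8)] for _ in range(50)]
best = max(population, key=lambda cand: robust_fitness(cand, chips))
```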
I'm having trouble understanding. Chips with very high transistor counts tend to use saturation/turn-off almost exclusively. Very little is done in the linear region because it burns a lot of power and it's less predictable.
Oh christ you're right, they were actually being really funny. I was being super literal and imagined them being very excited about futuristic advances in giant isopod diagnosis and care
I really wish I still had the link, but there used to be a website that listed a bunch of times in which machine learning was used (mostly via reinforcement learning) to teach a computer how to play a video game and it ended up using perverse strategies that no human would do. Like exploiting weird glitches (https://www.youtube.com/watch?v=meE5aaRJ0Zs shows this with Q*bert)
In my thesis many years ago [0] I used EAs to build bicycle wheels. They were so annoyingly good at exploiting whatever idiosyncrasies existed in my wheel simulator. In the first iterations of my simulator, it managed to evolve wheels that would slowly oscillate, due to floating-point instability or something; when forces were applied, they would increase and increase until the whole simulator exploded and the recorded forces were all over the place, which of course out-competed any wheel in at least some objective dimension.
After fixing those bugs, I mostly struggled with it taunting me. Like building a wheel with all the spokes going from the hub straight up to the rim. It would of course break down when rolling, but on the objective of "how much load can it handle on the bike" it again out-competed every other wheel, and thus sat on the Pareto front of that objective and kept showing up through all my tests. Hated that guy, heh. I later changed it to test all wheels in at least 4 orientations; it would then still taunt me with wheels like (c) in this figure [1], exploiting that.
My favorite was the ML learning how to make the optimal lowest-impact landing in a flight simulator: it discovered that it could wrap the impact float value if the impact was high enough, so instead of figuring out the optimal landing, it started figuring out the optimal path to the highest-impact crashes.
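IEEE 754 floats don't actually wrap (they saturate toward infinity), so presumably the simulator kept the impact in some fixed-width representation. A toy illustration of how a wrapping value flips the objective, assuming a hypothetical 16-bit impact accumulator:

```python
def wrap_int16(x: int) -> int:
    # Two's-complement wrap-around, as a 16-bit accumulator would behave.
    return ((x + 2**15) % 2**16) - 2**15

def landing_penalty(impact: int) -> int:
    # The optimizer minimizes this "penalty".
    return wrap_int16(impact)

print(landing_penalty(100))     # gentle landing -> penalty 100
print(landing_penalty(40_000))  # violent crash  -> wraps to -25536, scored as "better"
```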
This comment ought to be higher up. Such a perfect summary of what I have struggled to understand, which is the "danger" of AI once we allow it to control things.
And yes, you can fix the bug, but the bike wheel guy shows there will always be another bug. We need a paper/proof that invents a process that can put an AI-supported (non-human-intervention) finite cap or limiter or something on the possible bug surface.
A conglomerate developed an AI vision system that you could hook up to your anti-aircraft systems to eliminate any chance of friendly fire. DARPA and the Pentagon went wild, pushing the system through tests so they could get to the live demonstration.
They hook up a live system loaded with dummy rounds, fly a few friendly planes over, and everything looks good. However, when they fly a captured MiG-21 over, the system fails to respond. The brass is upset and the engineers are all scratching their heads trying to figure out what is going on, but as the sun sets the system lights up, trying to shoot down anything in the sky.
They quickly shut down the system and do a postmortem. In the review they find that all the training data for friendly planes were perfect-weather, blue-sky overflights, and all the training data for the enemy were nighttime / low-light pictures. The AI determined that anything flying during the day is friendly and anything at night is to be terminated with extreme prejudice.
we used synthetic data for training a (sort of) similar system. not gonna get into the exact specifics, but we didn't have a lot of images of one kind of failure use-case.
like there are just not that many pictures of this stuff. we needed hundreds, ideally thousands, and had, maybe, a dozen or so.
okay, so we'll get a couple of talented picture / design guys from the UI teams to come out and do a little photoshop of the images. take some of the existing ones, play with photoshop, make a couple of similar-but-not-quite-the-same ones, and then hack those in a few ways. load those into the ML and tell em they're targets and to flag on those, etc. etc.
took a week or two, no dramas, early results were promising. then it just started failing.
turns out we ran into issues with two (2) pixels, black pixels against a background of darker black shades, that the human eye basically didn't see or notice; these were artifacts from photoshopping, and then re-using parts of a previous image multiple times. the ML started determining that 51% or more of the photos had those 2 pixels in there, and that photos lacking those -- even when painfully obvious to the naked eye -- were fails.
like, zooming in at it directly you're like yea, okay, those pixels might be different, but otherwise you'd never see it. thankfully output highlighting flagged it reasonably quickly but still took 2-3 weeks to nail down the issue.
Simplification is the problem here, arguably. Even a simple-sounding objective (say, a bicycle wheel that holds load the best) has at least one implicit assumption: it will be handled and used in the real world. Which means it'll be subject to sloppy handling and thermal spikes and weather and abuse and all kinds of things beyond just meeting the goal. Any of those cheesy AI designs, if you were to 3D-print/replicate them, would fall apart as you picked them up. So the problem seems to be that the ML algorithm is getting too simple a goal function - one lacking the "used in the real world" part.
I feel that a good first step would be to introduce some kind of random jitter into the simulation. Like, in case of the wheels, introduce road bumps, and perhaps start each run by simulating dropping the wheel from a short distance. This should quickly weed out "too clever" solutions - as long as the jitter is random enough, so RL won't pick up on it and start to exploit its non-randomness.
Speaking of road bumps: there is no such thing in reality as a perfectly flat road; if the wheel simulator is just rolling wheels on mathematically perfect roads, that's a big deviation from reality - precisely the kind that allows for "hacky" solutions that are not possible in the real world.
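A sketch of that kind of randomized evaluation (toy code; `raw_strength` and the simulator stand-in are made up for illustration): every rollout gets a random bump profile and drop height, and a wheel is scored by its worst rollout, so a design that only survives a mathematically perfect road stops looking good.

```python
import random

def simulate_rollout(wheel, bump_height, drop_height):
    # Stand-in for one pass of the real wheel simulator under perturbed conditions.
    return wheel["raw_strength"] - 50.0 * bump_height - 20.0 * drop_height

def jittered_fitness(wheel, n_rollouts=8, seed=None):
    rng = random.Random(seed)
    scores = [
        simulate_rollout(
            wheel,
            bump_height=rng.uniform(0.0, 0.02),  # random road roughness (m)
            drop_height=rng.uniform(0.0, 0.5),   # random initial drop (m)
        )
        for _ in range(n_rollouts)
    ]
    return min(scores)  # judge each design by its worst case

print(jittered_fitness({"raw_strength": 100.0}, seed=1))
```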
You would have to introduce jitter to every possible dimension, when the dimensions themselves are continually expanding (as illuminated by the bike wheel example). The combination of jitter x dimensions leads to an undefined problem (AKA a theory of everything) in exponential fashion.
My point was that instead of blaming ML - or optimisation tools really - for gaming objective functions and coming up with non-solutions that do maximise reward, AI could instead be used to measure the reward/fitness of the solution.
So to the OP's example "optimise a bike wheel", technically an AI should be able to understand whether a proposed wheel is good or not, in a similar way to a human.
Humans don't simplify problems by reducing them to objective functions: we simplify them by reducing them to specific instances of abstract concepts. Human thought is fundamentally different to the alien processes of naïve optimising agents.
We do understand the "real objectives", and our inability to communicate this understanding to hill-climbing algorithms is a sign of the depth of our understanding. There's no reason to believe that anything we yet call "AI" is capable of translating our understanding into a form that, magically, makes the hill-climbing algorithm output the correct answer.
All these claims are like "programming is impossible because I typed in a program and it had a bug". Yes, everyone's first attempt at a reward function is hackable. So you have to tighten up the reward function to exclude solutions you don't want.
Is that the Learnfun/Playfun that tom7 made? That one paused just before losing at Tetris and left it like that, because any other input would make it lose.
Make no mistake, most humans will exploit any glitches and bugs they can find for personal advantage in a game. It's just that machines can exploit timing bugs better.
Some people are able to do frame-perfect inputs semi-consistently, from what I understand. I don't understand how, as my own performance is around hitting a 100 ms window maybe once every other time.
If you're using a typical PC (or $deity forbid, a phone) with a typical consumer OS, there's several sources of variability between your controller and the visual feedback you receive from the game, each of which could randomly introduce delays on the order of milliseconds or more. That "randomly" here is the key phrase - lag itself is not a problem, the variability is.
There are a few very cool examples where someone recently used RL to solve Trackmania and ended up having to add all sorts of constraints/penalties to prevent the extremely strange exploits/glitches that were discovered, IIRC… it's been a while since I watched.
Well, in the case of the latter, there was a vaguely known glitch for driving on the nose that allowed for better speeds than possible on 4 wheels, but it would be completely uncontrollable to a human. He figured out how to break the problem down into steps that the NN could gradually learn piecewise, until he had cars racing around tracks while balancing on their nose.
It turned out to have learned to keep the car spinning on its nose for stability, and timing inputs to upset the spinning balance at the right moment to touch the ground with the tire to shoot off in a desired direction.
I think the overall lesson is that, to make useful machine learning, we must break our problems down into pieces small enough that an algorithm can truly "build up skills" and learn naturally, under the correct guidance.
Haha that was actually the same one I posted in my comment.
This was some old website. A coworker sent it to me on Hipchat at my previous job about 10 years ago. And finding anything online older than like 5 years is nearly impossible unless you have the exact URL on hand.
Oh sorry! I recognized the description but since I recalled mine being a Google Sheets link, I just went straight into search mode - and yep, it actually took me a bit to find.
For the model, the weird glitches are just another element of the game. As it can't reason, and has no theory of the world or even any real knowledge of what it is doing, the model doesn't have the prior assumptions a human would have about how the game is supposed to be played.
If you think about it, even using the term "perverse" is a result of us anthropomorphizing any object in the universe that does anything we believe is in the realm of things humans do.
Not quite what you're describing, but no one has yet linked the classic Tom7 series where he applies deep learning to classic NES games: https://youtu.be/xOCurBYI_gY
I've referenced this paper many times here; it's easily in my top 10 of papers I've ever read. It's one of those ones that, if you go into it blind, you have several "Oh no f'king way" moments.
The interesting thing to me now is... that research is very much a product of the right time. The specific Xilinx FPGA he was using was incredibly simple by today's standards and this is actually what allowed it to work so well. It was 5v, and from what I remember, the binary bitstream to program it was either completely documented, or he was able to easily generate the bitstreams by studying the output of the Xilinx router- in that era Xilinx had a manual PnR tool where you could physically draw how the blocks connected by hand if you wanted. All the blocks were the same and laid out physically how you'd expect. And the important part is that you couldn't brick the chip with an invalid binary bitstream programming. So if a generation made something wonky, it still configured the chip and ran it, no harm.
Most, if not all, modern FPGAs just cannot be programmed like this anymore. Just randomly mutating a bitstream would, at best, make an invalid binary that the chip just won't burn. Or, at worst, brick it.
I remember this paper being discussed in the novel "The Science of Discworld" -- a super interesting book involving collaboration between a fiction author and some real-world scientists -- where the fictional characters in the novel discover our universe and its rules ... I always thought there was some deep insight to be had about the universe within this paper. Now, more so, I think the unexpectedness says something instead about the nature of engineering and control and human mechanisms for understanding these sorts of systems -- sort of by definition, human engineering relies on linearized approximations to characterize the effects being manipulated, so something which operates in modes far outside those models is basically inscrutable. I think that's kind of expected, but the results still provoke the fascination to ponder the solutions superhuman engineering methods might yet find with modern technical substrates.
Xe highly recommend the series! Xe keep going back to them for bedtime audio book listening. Chapters alternate between fact and fiction and the mix of intriguing narrative and drier but compelling academic talk help put xir otherwise overly busy mind to rest. In fact, xe bought softcover copies of two of them just last week.
The science is no longer cutting edge (some are over twenty years old) but the deeper principles hold and Discworld makes for an excellent foil to our own Roundworld, just as Sir Pratchett intended.
Indeed, the series says more about us as humans and our relationship to the universe than the universe itself and xe love that.
People don't necessarily choose their own pronouns based on how it will reflect on an oppressed group, and they don't necessarily intend to be representing a group when representing themselves.
IIRC the flip-side was that it was hideously specific to a particular model and batch of hardware, because it relied on something that would otherwise be considered a manufacturing flaw.
Not even one batch. It was specific to that exact one chip it was evolved on. Trying to move it to another chip of the same model would produce unreliable results.
There is actually a whole lot of variance between individual silicon chips; even two chips right next to each other on the wafer will perform slightly differently. They will all meet the spec on the datasheet, but datasheets always specify ranges, not exact values.
If I recall the original article correctly, I believe it even went a step further. While running on the same chip it evolved on, if you unplugged the lamp that was in the outlet closest to the chip, the chip stopped working. It was really fascinating how environmentally specific it evolved to be.
That said, it seems like it would be very doable to first evolve a chip with the functionality you need in a single environment, then slowly vary parameters to evolve it to be more robust.
Or, vice versa, begin evolving the algorithm using a fitness function that is the average performance across 5 very different chips, to ensure some robustness is built in from the beginning.
> slowly vary parameters to evolve it to be more robust
Injecting noise and other constraints (like forcing it to place circuits in different parts of the device) are totally valid when it needs to evolve in place.
For the most part, I think it would be better to run in a simulator where it can evolve against an abstract model, then it couldn't overfit to the specific device and environment. This doesn't work if the best simulator of the system is the system itself.
Yeah, if you took it outside the temperature envelope of the lab it failed. I guess thermal expansion?
There were also a bunch of cells that had inputs, but no outputs. When you disconnected them... the circuit stopped working. Shades of "magic" and "more magic".
I've never worked with it, but I've had a fascination with GA/GP ever since this paper/the Tierra paper. I do wonder why it's such an attractive technique - simulated annealing or hill climbing just don't have the same appeal. It's the biological metaphor, I think.
A long time ago, maybe in the Russian journal "Radio" around 198x, someone described that if you got a certain transistor from a particular batch from a particular factory/date and connected it up in some weird way, it would make a full FM radio (or something similarly complex)... because they had gotten the yields wrong. No idea how they figured that out.
But mistakes aside, what would it be like if chips from the factory could learn / fine-tune how to work (better) on the fly..
At my high school, we had an FM radio transmitter on the other side of the street. Pretty often you could hear one of the stations in the computer speakers in the library, so FM radio can be detected by simple analog circuits.
I remember talking about this with my friend and fellow EE grad Connor a few years ago. The chip's design really feels like a biological approach to electrical engineering, in the way that all of the layers we humans like to neatly organize our concepts into just get totally upended and messed with.
The interesting thing about this project is that it shouldn't even be possible if the chip behaved as an abstract logical circuit, since then it would simply implement a finite automaton. You must abuse the underlying physics to make the logic gates behave like something else.
I strongly dislike when people say AI when they actually mean optimizer. Calling the product of an optimizer “AI” is more defensible, you optimized an MLP and now it writes poetry. Fine. Is the chip itself the AI here? That’s the product of the optimizer. Or is it the 200 lines of code that defines a reward and iterates the traces?
Optimization is near and dear to my heart (see username), but I think it’s fine to call optimization processes AI because they are in the classical sense.
I think it is as follows: we call it AI nowadays as long as we cannot clearly and easily show how to get to the result, which means the computer did something that seems intelligent to us for the moment. Once we can explain things and write down a concise algorithm, we hesitate to call it AI.
Basically, we call things AI, that we are too stupid to understand.
I think what's really happened is we get output that's close enough to normal communication to "feel" human. I could say it's all a giant trick, which it kind of is, but we've also gotten to the point where the trick is also useful for many things that previously didn't have a good solution.
> I strongly dislike when people say AI when they actually mean optimizer.
It probably uses a relatively simple hill climbing algorithm, but I would agree that it could still be classified as machine learning. AI is just the new, hip term for ML.
What? Quite the opposite. AI is the original and broader term, ML is a subset of AI. Deep Learning was the "hot" terminology around 2015-2018, and since 2022/Chatgpt, LLM has become the trendy word. Yes, people now talk about "AI" as well, but that term has always been there, and anytime some AI technique becomes talked about, the term AI gets thrown around a lot too.
(Note - I may have misunderstood your meaning btw, if so apologies!)
The "OG" AI research, like the the era of Minsky's AI Lab at MIT in the 1970s, broke AI into a few sub-domains, of which optimization was one. So long before we used the term AI to describe an LLM-based chat bot, we used it to describe optimization algorithms like genetic algorithms, random forests, support vector machines, etc.
My point is that it’s equally ridiculous to call either AI. If our chip here is not the AI then the AI has to be the optimizer. By extension that means AdamW is more of an AI than ChatGPT.
I don't understand. I learnt about optimizers, and genetic algorithms in my AI courses. There are lots of different things we call AI, from classical AI (algorithms for discrete and continuous search, planning, sat, Bayesian stuff, decision trees, etc.) to more contemporary deep learning, transformers, genAI etc. AI is a very very broad category of topics.
Optimization can be a tool used in the creation of AI. I'm taking issue with people who say their optimizer is an AI. We don't need to personify every technology that can be used to automate complex tasks. All that does is further dilute an already overloaded term.
I agree that the article is wrong in using the wording “the AI”. However, firstly the original publication [0] doesn’t mention AI at all, only deep-learning models, and neither do any of the quotes in the article. Secondly, it is customary to categorize the technology resulting from AI research as AI — just not as “an AI”. The former does not imply any personification. You can have algorithms that exhibit intelligence without them constituting any kind of personal identity.
You can remove the word 'an' if you're attributing some weird meaning to it, the point is still valid. Genetic algorithms and optimizers are usually in there to make AI algorithms, they aren't themselves AI algorithms.
And you have to be doing something rather specific with a pile of if statements for it to count as an expert system.
I think that's a bit different. The term is overloaded. There's "the machine is thinking" AI and then there's "this fairly primitive code controls an agent" AI. The former describes the technique while the latter describes the use case.
Artificial intelligence, as others are using it here to cover a broad field of study or set of techniques. You seem to be objecting because the described product is not "an artificial intelligence", i.e. an artificial mind.
For some of us, your objection sounds as silly as if we were to tell some student they didn't use algebra, because what they wrote down isn't "an algebra".
You use optimization to train AI, but we usually refer to AI as being the parametrized function approximator that is optimized to fit the data, not the optimizer or loss function themselves.
This is "just" an optimizer being used in conjunction with a simulation, which we've been doing for a long, long time. It's cool, but it's not AI.
Optimization is a branch of mathematics concerned with optimization techniques, and the analysis and quality of possible solutions. An optimizer is an algorithm concerned with finding optima of functions. You don't get to rewrite decades of mathematical literature because it gives you AI vibes.
Yeah, you need an optimizer to train AI, but it's not the AI part. Most people would refer to and understand AI as being the thing they interact with. You can't interact with an optimizer, but you can interact with the function that is being optimized.
I'm honestly stunned that this is even a controversial position.
FWIW, I suspect there are more folks here with exposure to decades of computer science literature about AI than to comparable mathematics literature.
The CS literature has used AI to refer to nearly any advanced search algorithm, e.g. during the prior AI boom and bust cycle around symbolic AI. In this literature, it is idiomatic that AI techniques are the broad category of search and optimization techniques. There wasn't necessarily any "training" involved, as machine learning was considered part of the AI topic area but not its entirety.
It's always been acknowledged that various disciplines had significant crossover, e.g. ML and operations research, but I've never seen anyone claim that optimization is AI until recently.
Ian Goodfellow's book is, what, 10 years old at this point? The fundamentals in that book cover all of ML from classical to deep learning, and pretty clearly enumerate the different components necessary to do ML, and there's no doubt that optimization is one of them. But to say that it is AI in the way that most people would probably understand it? It's a stretch, and hinges on whether you're using AI to refer to the collection of techniques or the discipline, as opposed to the output (i.e. the "intelligence"). I, and I'd argue most people, use AI to refer to the latter, but I guess the distinction between the discipline and the product is vague enough for media hype.
And to be clear, I'm not trying to take away from the authors. Optimization is one of the tools I like to throw around, both in my own projects and professionally. I love seeing cool applications of optimization, and this definitely qualifies. I just don't agree that everything that uses optimization is AI, because it's an unnecessary blurring of boundaries.
> You don't get to rewrite decades of mathematical literature because it gives you AI vibes.
AI as a term was invented to describe exactly this. Any usage of the term AI which does not include this is a misunderstanding of the term. You don't get to rewrite decades of computer science literature because it fails to give you AI vibes.
> Most people would refer to and understand AI as being the thing they interact with. You can't interact with an optimizer, but you can interact with the function that is being optimized.
I have no idea what you mean by "interact with" in this context. You can use a non AI optimizer to train an AI. You can also create an AI that serves the function of an optimizer. Optimization is a task, artificial intelligence is an approach to tasks. A neural network trained to optimize chip design is exactly as much an AI as a neural network trained to predict protein folding or translate speech.
There's no function here that is analogous to a decision tree, or a parametrized model, just an optimizer and a loss function with a simulator. This isn't AI in the way it's commonly understood, which is the function that takes an input and produces a learned output.
An optimizer produces a single optimized set of parameters. AI is a (usually parametrized) function mapping a collection of input states to a collection of output states. The function is the AI, not the optimizer. I'd suggest anyone who thinks otherwise go and do some basic reading.
Sigh, another day, another post I must copy paste my bookmarked Wikipedia entry for:
> "The AI effect" refers to a phenomenon where either the definition of AI or the concept of intelligence is adjusted to exclude capabilities that AI systems have mastered. This often manifests as tasks that AI can now perform successfully no longer being considered part of AI, or as the notion of intelligence itself being redefined to exclude AI achievements.[4][2][1] Edward Geist credits John McCarthy for coining the term "AI effect" to describe this phenomenon.[4]
> McCorduck calls it an "odd paradox" that "practical AI successes, computational programs that actually achieved intelligent behavior were soon assimilated into whatever application domain they were found to be useful in, and became silent partners alongside other problem-solving approaches, which left AI researchers to deal only with the 'failures', the tough nuts that couldn't yet be cracked."[5] It is an example of moving the goalposts.[6]
Prior to 2021/202-whenever, most sensible people called this stuff deep learning / machine learning etc. For over 15+ years it’s been called machine learning — “getting machines to complete tasks without being explicitly programmed to do so”.
since 2021/whenever LLM applications got popular everyone has been mentioning AI. this happened before during the previous mini-hype cycle around 2016-ish where everyone was claiming neural networks were “AI”. even though, historically, they were still referred to by academics as machine learning.
no-one serious who actually works on these things, and isn't interested in making hordes of $$$ or getting popular on social media, calls this stuff AI. so if there were a wikipedia link one might want to include on this thread, I'd say it would be this one — https://en.m.wikipedia.org/wiki/Advertising
because, let’s face it, advertising/marketing teams selling products using linear regression as “AI” are the ones shifting the definition into utter meaninglessness.
so it’s no surprise people on HN, some of whom actually know stuff about things, would be frustrated and annoyed and get tetchy about calling things “AI” (when it isn’t) after 3 sodding years of this hype cycle. i was sick of it after a month. imagine how i feel!
Machine learning is a subfield of AI. Complaining about calling ML AI is like complaining about calling Serena Williams an "athlete" because she's actually a "tennis player"
You've missed the point I was making it seems, so I'll condense and focus down on it.
The reason why the "AI" goalposts always seem to shift -- is not because people suddenly decide to change the definition, but because the definition gets watered down by advertising people etc. Most people who know anything call this stuff deep learning/machine learning to avoid that specific problem.
Personally, I can't wait for people who work in advertising to get put on the same spaceship as the marketers and telephone sanitizers. (It's not just people in advertising. i just don't like advertising people in particular).
--
I'd argue machine learning is actually a sub-field within statistics. but then we're gonna get into splitting hairs about whether Serena Williams is an athlete, or a professional sports player. which wasn't really the point I was making and isn't actually that important. (also, it can be a sub-field of both, so then neither of us is wrong, or right. isn't language fun!).
On the contrary. The "AI effect" is an example of attempting to hold others to goalposts that they never agreed to in the first place.
Instead of saying "this is AI and if you don't agree then you're shifting the goalposts" instead try asking others "what future developments would you consider to be AI" and see what sort of answers you get.
People did ask that, and they got back answers like "beating grandmasters at chess" and "being able to hold a conversation with a human," but no one considers chess engines or chatbots to be AI anymore because the goal posts were moved.
I would dispute that. I consider both of those examples to be AI, but not general AI and not particularly strong AI.
Meanwhile I do not consider gradient descent (or biased random walk, or any number of other algorithms) to be AI.
The exact line is fuzzy. I don't feel like most simple image classifiers qualify, whereas style transfer GANs do feel like a very weak form of AI to me. But obviously it's becoming quite subjective at that point.
Is this really so novel? Engineers have been using evolutionary algorithms to create antennas and other components since the early 2000s at least. I remember watching a FOSDEM presentation on an 'evolved' DSP for radios in the 2010s.
I don't believe it's comparable. Yes, we've used algorithms to find "weird shapes that work" for a long time, but they've always been very testable. AI is being used for more complex constructs that have exponentially greater testable surface area (like programs and microarch).
Yes, for low-frequency analog circuits these experiments go back to the 1990s at least.
J. R. Koza, F. H. Bennett, D. Andre, M. A. Keane, and F. Dunlap, "Automated synthesis of analog electrical circuits by means of genetic programming," IEEE Trans. Evol. Comput., vol. 1, pp. 109–128, July 1997.
https://dl.acm.org/doi/10.1109/4235.687879
This is really interesting and I’m surprised I’ve never even heard of it before.
Now I’m imagining antennas breeding and producing cute little baby antennas that (provided they’re healthy enough) survive to go on to produce more baby antennas with similar characteristics, and so on…
It’s a weird feeling to look at that NASA spacecraft antenna, knowing that it’s the product of an evolutionary process in the genuine, usual sense. It’s the closest we can get to looking at an alien. For now.
These are highly complicated pieces of equipment almost as complicated as living organisms.
In some cases, they've been designed by other computers.
We don't know exactly how they work.
This comment (not mine) from the article is absolute Gold:
> "Not only did the chip designs prove more efficient, the AI took a radically different approach — one that a human circuit designer would have been highly unlikely to devise."
> That is simply not true... more likely, a human circuit designer would not be allowed to present a radical new design paradigm to his/her superiors and other lead engineers. (a la Edison, Westinghouse, Tesla, Da Vinci, et-al.)
> AI models have, within hours, created more efficient wireless chips through deep learning, but it is unclear how their 'randomly shaped' designs were produced.
IIRC this was also tried at NASA, they used some "classic" genetic algorithm to create the "perfect" antenna for some applications, and it looked unlike anything previously designed by engineers, but it outperformed the "normal" shapes. Cool to see deep learning applied to chip design as well.
Wasn't there a GA FPGA design to distinguish two tones that was so weird and specific that not only did it use capacitance for part of its work, but it literally couldn't work on another chip of the same model?
Yes, indeed, although the exact reference escapes me for the moment.
What I found absolutely amazing when reading about this, is that this is exactly how I always imagined things in nature evolving.
Biology is mostly just messy physics where everything happens at the same time across many levels of time and space, and a complex system that has evolved naturally appears to always contain these super weird specific cross-functional hacks that somehow end up working super well towards some goal
I've only started to look into the complexities involved in chip design (for my BitGrid hobby horse project) but I've noticed that in the Nature article, all of the discussion is based on simulation, not an actual chip.
Let's see how well that chip does if made by the fab. (I doubt they'd actually make it, likely there are a thousand design rule checks it would fail)
If you paid them to override the rules and make it anyway, I'd like to see if it turned out to be anything other than a short circuit from power to ground.
They do have some measurement results in figures 6 and 7. Looks like they didn't nail the center frequencies but at mmWave it's reasonable for a first attempt -- they're still missing something in their model though, same as if you did it by hand.
I'm skeptical that these pixelated structures are going to turn out anything better than the canonical shapes. They look cool but may just be "weird EM tricks", deconstructing what doesn't really need to be. Anyone remember the craze for fractal antennas?
Our human designs strive to work in many environmental conditions. Many early AI designs, if iterated in the real world, would incorporate local physical conditions into their circuits. For example, that fluorescent lamp or fan I'm picking up (from the AI/evolutionary design algorithm's perspective) has great EM waves that could serve as a reliable clock source, eliminating the need for my own. Thus, if you move things, it would break.
I am sure there are analogous problems in the digital simulation domain. Without thorough oversight and testing through multiple power cycles, it's difficult to predict how well the circuit will function, and how incorporating feedback into the program will affect its direction; if you're not careful, this causes the aforementioned strange problems.
Although the article mentions corrections to the designs, what may be truly needed is more constraints. The better we define these constraints, the more likely correctness will emerge on its own.
> Our human designs strive to work in many environmental conditions. Many early AI designs, if iterated in the real world, would incorporate local physical conditions into their circuits. For example, that fluorescent lamp or fan I'm picking up (from the AI/evolutionary design algorithm's perspective) has great EM waves that could serve as a reliable clock source, eliminating the need for my own. Thus, if you move things, it would break.
This problem may have a relatively simple fix: have two FPGAs – from different manufacturing lots, maybe even different models or brands – each in a different physical location, maybe even on different continents. If the AI or evolutionary algorithm has to evolve something that works on both FPGAs, it will naturally avoid purely local stuff which works on one and not the other, and produce a much more general solution.
This is similar to why increasing the batch size during LLM training results in better performance: you force the optimizer to generalize to a larger set.
Ask the same "AI" to create a machine readable proof of correctness. Or even better - start from an inefficient but known to be working system, and only let the "AI" apply correctness-preserving transformations.
I don’t think it’s that easy. I’m sure Intel, AMD and Apple have a very sophisticated suite of “known working systems” that they use to test their new chips, and they still build in bugs that security researchers find 5 years later. It’s impossible to test and verify such complex designs fully.
It's a little different in software. If I'm writing a varint decoder and find that it works for the smallest and largest 65k inputs, it's exceedingly unlikely that I'll have written a bug that somehow affects only some middling number of loop iterations yet somehow handles those already tested transitions between loop iteration counts just fine.
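For concreteness, here is roughly the kind of decoder and boundary sweep meant: a LEB128-style varint, sketched as generic Python rather than any particular library's wire format, and assuming a 64-bit value domain for the "largest 65k" end of the sweep.

```python
def encode_varint(value: int) -> bytes:
    # LEB128-style: 7 payload bits per byte, high bit set on all but the last byte.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        out.append(byte | (0x80 if value else 0))
        if not value:
            return bytes(out)

def decode_varint(data: bytes) -> tuple[int, int]:
    # Decode one unsigned varint; return (value, bytes consumed).
    value, shift = 0, 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear -> last byte
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")

# Spot-check the smallest and largest 65k inputs of a 64-bit domain.
for n in list(range(65536)) + list(range(2**64 - 65536, 2**64)):
    encoded = encode_varint(n)
    assert decode_varint(encoded) == (n, len(encoded))
```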
For a system you completely don't understand, especially when the prior work on such systems suggests a propensity for extremely hairy bugs, spot-checking the edge cases doesn't suffice.
And, IMO, bugs are usually much worse the lower down in the stack they appear. A bug in the UI layer of some webapp has an impact and time to fix in proportion to that bug and only that bug. Issues in your database driver are insidious, resulting in an unstable system that's hard to understand and potentially resulting in countless hours fixing or working around that bug (if you ever find it). Bugs in the raw silicon that, e.g., only affect 1 pair of 32-bit inputs (in, say, addition) are even worse. They'll be hit in the real world eventually, and they're not going to be easy to handle, but it's simultaneously not usually practical to sweep a 64-bit input space (certainly not for every chip, if the bug is from analog mistakes in the chip's EM properties).
Literally no piece of software is bug-free. Not one. What are you talking about? Of course it's impossible to test all inputs, because there are going to be inputs that you can't even conceive of at the time of designing. What if your application suddenly runs at 1000000x the intended speed because hardware improves so much? How do you test for that?
Yes it does. It ages. But even if it doesn't, my point still stands. Or are you insinuating that the engineers over at Intel, AMD and Apple don't know what they're doing, because clearly their CPUs aren't flawless and still have bugs, like Spectre/Meltdown?
It deteriorates, it doesn't change. The functionality is still there and no modern hardware deteriorates to a failing state before it gets obsolete. Yes, I am insinuating that the engineers at intel, AMD, apple and nvidia are incentivized to prioritize expedient solutions over developing more robust architectures, as evidenced by vulnerabilities like Spectre and Meltdown.
Pieces like this remind me that even professors need to sell what they do, like saying "Humans cannot really understand them." in this case. Never have we ever had more simulation tools and compute power like we have today and we can't understand how these chips really work?
I think this is an example of mystifying-for-marketing as used in academia, like portraying this research as some breakthrough at a level that exceeds human understanding. IMHO practitioners of science should be expected to do better than this.
It's not necessarily the professor really saying that. Journalists (and university press offices) like to have such lines in pop science articles, and how it goes is that there's an interview from which the writer "interprets" some quotes. These are typically sent to the interviewee to check, but many don't bother to push back if it's not egregiously bad.
I’ve never been able to put it into words, but when we think about engineering in almost any discipline, a significant amount of effort goes into making things buildable by different groups of people. We modularize components or code so that different groups can specialize in isolated segments.
I always imagined if you could have some super mind build an entire complex system, it would find better solutions that got around limitations introduced by the need to make engineering accessible to humans.
An "optimal" solution may do away with "wasteful" abstraction of interfaces and come up with something more efficient. But there is wisdom in narrow interfaces and abstractions. Structure helps to evolve over time which at least for now most computer optimization focuses on getting the best solution now.
I think it's half guess and half hope, but I imagine we'll spend centuries building really dumb mechanisms, then suddenly be completely left in the dust intellectually. I guess that's what you'd call the singularity. I don't know if that hypermind will bother designing circuits for us.
I thought tiny wireless antennas were already dark magic that people barely understood anyway, more trial and error than theory. Feels like yet another so-called science publication doing a clickbait headline.
As I kid I played a competitive text based strategy game, and I made my own crude simulation that randomly tried different strategies. I let the simulation run for a few days with billions of iterations, and it came up with a very good gameplay strategy. I went from being ranked below 1000 to top 10 using that strategy.
I also wrote programs that simulated classic game shows like the three-doors problem, where you either stay with your door or change doors. After running the simulation one million times, it ended up with a 66% chance of winning if you changed doors. The teacher of course didn't believe me, as it was too hard a problem for a high schooler to solve, but many years later I got it confirmed by a math professor who proved it.
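For the record, the three-doors (Monty Hall) result takes only a few lines to reproduce; a minimal version of that kind of simulation, in plain Python rather than the original program:

```python
import random

def monty_hall(switch: bool, trials: int = 1_000_000) -> float:
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither the pick nor the prize.
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(monty_hall(switch=False))  # ~0.33
print(monty_hall(switch=True))   # ~0.67
```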
Computers are so fast that you don't really need AI learning to iterate, just run a simulation randomly and you will eventually end up with something very good.
I think this might be a use case for quantum computers, so if you have a quantum computer I'm interested to work with you.
I think it's pure AI hype to claim these are beyond human understanding, and I doubt that's what the professor really meant. There's real physical processes going on, and we can study them carefully to eventually learn how they work. We just don't understand them yet.
It's religion that claims reality is beyond human understanding, it's not something scientists should be doing.
Some software I inherited from my predecessor is already like this.
When I got it, one part of it was a single Perl file with about 5k lines of code, with 20+ variables visible in the whole file, with 10+ levels of nested loops, basically all of them with seemingly random "next LABEL" and "last LABEL" statements, which are basically slightly-constrained GOTOs. Oh, and the variable names were mostly meaningless to me (one or two letters).
This was only a small part of my job, over the years I've managed to reduce this mess, broke out some parts into smaller functions, reduced the scope of some variables etc. but a core remains that I still don't really understand. There's some mental model deep in the original programmer's mind that I simply cannot seem to grasp and that the code structure is based on.
(We're now replacing this whole thing by cleaner re-implementation, with unit tests, a less idiosyncratic structure, and more maintainers).
Now imagine what it must feel like if the original programmer wasn't human, but some alien mind that we're even further from understanding.
It’s really been advertised heavily lately but I just discovered it a couple weeks ago, and in case you’re unaware the real aha moment with Cursor for me was Composer in Agent mode with Sonnet 3.5.
If you want the highest chance of success, use a reasoning model (o3-mini high, o1 pro, r1, grok 3 thinking mode) to create a detailed outline of how to implement the feature you want, then copy paste that into composer.
It one shots a lot of greenfield stuff.
If you get stuck in a loop on an issue, this prompt I got from twitter tends to work quite well to get you unstuck: "Reflect on 5-7 different possible sources of the problem, distill those down to 1-2 most likely sources, and then add logs to validate your assumptions before we move onto implementing the actual code fix."
Just doing the above gets me through 95% of stuff I try, and then occasionally hopping back out to a reasoning model with the current state of the code, errors, and logs gets me through the last 5%.
Just last night I took a similar approach to arriving at a number of paths to take, when I shared my desired output with a knowledge graph that I had populated and asked the AI to fill in the blanks about the activities that would lead a user to my desired output. It worked! I got a few non-correlative gaps that came up as well and, after some fine-tuning, they got included in the graph to enrich the contentious output.
I feel this is a similar approach and it's our job to populate and understand the gaps in between if we are trying to understand how these relationships came to existence. a visual mind map of the nodes and the entire network is a big help for a visual learner like myself to see the context of LLMs better.
anyway, the tool I used is InfraNodus and am curious if this community is aware of it, I may have even discovered it on HN actually.
> The AI also considers each chip as a single artifact, rather than a collection of existing elements that need to be combined. This means that established chip design templates, the ones that no one understands but probably hide inefficiencies, are cast aside.
there should be a word for this process of making components efficiently work together, like 'optimization' for example
This is a strange distinction for the article to point out. If you want to take a more modular approach all you have to do is modify the loss function to account for that. It's entirely arbitrary.
And the fact that humans "cannot understand it" means that it's likely overfitted to the job. If you want to make slight modifications to the design, you'll likely have to run the AI tool over again and get a completely new design, because there's zero modularity.
I wonder about the security of such designed chips. It has been demonstrated that apparently optimal architectures can lead to huge errors that create security flaws (Spectre, PACMAN for M1, etc.).
"Although the findings suggest that the design of such complex chips could be handed over to AI, Sengputa was keen to point out that pitfalls remain “that still require human designers to correct.” In particular, many of the designs produced by the algorithm did not work– equivalent to the "hallucinations" produced by current generative AI tools."
Vast chunks of engineering are going to be devalued in the next 10-15 years, across all disciplines. It's already enabling enormous productivity gains in software, and there's zero reason this can't translate to other areas. I don't see any barrier to transformers being able to write code-cad for a crankshaft or a compressor, for example, other than the fact that so far they haven't been trained to do so. Given the extent to which every industry uses software for design, there's nothing to really stop the creation of wrappers and the automation of those tasks. In fact, proprietary kernels aren't even a barrier, because the gains in productivity make building a competitor easier than ever before.
I certainly disagree that it's enabling enormous productivity gains in software. It's a productivity loss to have a tool whose output you have to check yourself every time (because you can't trust it to work reliably).
When I was studying, I implemented a flight dynamics simulation from scratch, partly as a learning exercise, and partly so that I could have greater control over the experiments I wanted to run. The trickiest part of this was the rotations between the local and inertial frames, which took the better part of a week for me to figure out (especially the time derivative of the quaternion).
On a lark, I asked Deep Seek to implement the relevant functions yesterday, and it spat them out. Not only were they correct, they came with a very good low level description of what the code was doing, and why -- i.e. all of the stuff my head was against the desk for while I was figuring it out.
If I wanted to implement, say, an EKF tomorrow, I have zero doubts that I could do it on my own if I had to, but I'm also 99% sure Deep Seek could just spit it out and I'd only have to check it and test it. It's not a substitute for understanding, and knowing the right questions to ask, but it is tremendously powerful. For the stuff I'm usually doing, which is typically mathematically demanding, and for which implementation can often be harder than checking an existing implementation is correct, it's a tremendous productivity gain.
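For anyone curious, the quaternion kinematics step being described is usually written dq/dt = 1/2 * q ⊗ (0, ω). A minimal sketch, assuming the Hamilton convention, scalar-first [w, x, y, z] storage, and body-frame rates (none of which are stated above, so treat it as illustrative only):

    import numpy as np

    def quat_mul(p, q):
        # Hamilton product of two quaternions stored as [w, x, y, z].
        pw, px, py, pz = p
        qw, qx, qy, qz = q
        return np.array([
            pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw,
        ])

    def quat_dot(q, omega_body):
        # Time derivative of the attitude quaternion (body -> inertial, Hamilton
        # convention) given the angular velocity expressed in the body frame.
        return 0.5 * quat_mul(q, np.array([0.0, *omega_body]))

    # Example: one small Euler step, then re-normalize to stay on the unit sphere.
    q = np.array([1.0, 0.0, 0.0, 0.0])
    omega = np.array([0.0, 0.0, 0.1])   # rad/s about the body z-axis
    dt = 0.01
    q = q + quat_dot(q, omega) * dt
    q /= np.linalg.norm(q)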
I mean, most complex operations-research optimal solutions are not graspable by the human brain. Look at a complex travelling-salesman solution with delivery time windows and your head will spin; you'll wonder how that solution can possibly be optimal. But then you try your own rational heuristic and it falls well short of the true optimum.
The same comments were made about John Koza's results with Genetic Programming. However, there are some obvious differences between the current model based techniques and Genetic Algorithms. Some feel that the path to AGI will necessarily include a GA component.
In a sense, Adrian Thompson kicked this off in the 90's when he applied an evolutionary algorithm to FPGA hardware. Using a "survival of the fittest" approach, he taught a board to discern the difference between a 1kHz and 10KHz tone.
The final generation of the circuit was more compact than anything a human engineer would ever come up with (reducible to a mere 37 logic gates), and utilized all kinds of physical nuances specific to the chip it evolved on - including feedback loops, EMI effects between unconnected logic units, and (if I recall) operating transistors outside their saturation region.
Article: https://www.damninteresting.com/on-the-origin-of-circuits/
Paper: https://www.researchgate.net/publication/2737441_An_Evolved_...
Reddit: https://www.reddit.com/r/MachineLearning/comments/2t5ozk/wha...
Related. Others?
The origin of circuits (2007) - https://news.ycombinator.com/item?id=18099226 - Sept 2018 (25 comments)
On the Origin of Circuits: GA Exploits FPGA Batch to Solve Problem - https://news.ycombinator.com/item?id=17134600 - May 2018 (1 comment)
On the Origin of Circuits (2007) - https://news.ycombinator.com/item?id=9885558 - July 2015 (12 comments)
An evolved circuit, intrinsic in silicon, entwined with physics (1996) - https://news.ycombinator.com/item?id=8923902 - Jan 2015 (1 comment)
On the Origin of Circuits (2007) - https://news.ycombinator.com/item?id=8890167 - Jan 2015 (1 comment)
That's not a lot of discussion—we should have another thread about this sometime. If you want to submit it in (say) a week or two, email hn@ycombinator.com and we'll put it in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page.
If you’re up for sharing, I’m curious to know approximately how many hours each week you spend working on HN. It seems like it would be an enormous amount of time, but I’m just guessing.
@dang has a neuralink implant directly feeding HN to his brain...
I don't count them so I'm afraid I don't know. The hours get sort of fractally sprayed across my days (and weeks).
Dang leads HN: https://www.newyorker.com/news/letter-from-silicon-valley/th...
I suspect cloning tech is out there and Dang(s) are one of the first successful iterations. I just dont get how there is seemingly no time off, no vacations, sick days etc. Talk about passion.
Other alternative is the image of pale guy with laptop on some beautiful beach or climbing some crazy peak. Same passion, just concentrated in 1 body.
Dang is the end product of an evolutionary algorithm.
Did something funky happen to the timestamps in this thread? I could've sworn I was reading it last night (~12h ago)
It looks like we put the thread in HN's second-chance pool (https://news.ycombinator.com/item?id=26998308), so it got re-upped and given a random slot on the frontpage.
The relativized timestamps are an artifact of the re-upping system. There are past explanations here: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que....
Sorry for the confusion! I know it's weird but the alternative turns out to be even more confusing and we've never figured out how to square that circle.
I think dang did something manual to push it back to the frontpage, and that reset the timestamps on everyone’s existing comments…
There is a comment here by me which says “2 hours ago”, I swear I wrote it longer ago than that - indeed, my threads page still says I wrote it 20 hours ago, so it is like part of the code knows when I really wrote it, another part now thinks I wrote it 18 hours later than I did…
Yes, the relativized timestamps only show on /news (i.e. the frontpage) and /item pages. You can always see the original timestamps on other pages, like /submitted, /from, or (as you say) /threads.
Edit: I checked the code and the actual list is:
Fascinating paper. Thanks for the ref.
Operating transistors outside the linear region (the saturated "on") on a billion+ scale is something that we as engineers and physicists haven't quite figured out, and I am hoping that this changes in future, especially with the advent of analog neuromorphic computing. The quadratic region (before the "on") is far more energy efficient and the non-linearity could actually help with computing, not unlike the activation function in an NN.
Of course, the modeling the nonlinear behavior is difficult. My prof would say for every coefficient in SPICE's transistor models, someone dedicated his entire PhD (and there are a lot of these coefficients!).
I haven't been in touch with the field since I moved up the stack (numerical analysis/ML) I would love to learn more if there has been recent progress in this field.
The machine learning model didn’t discover something that humans didn’t know about. It abused some functions specific to the chip that could not be repeated in production or even on other chips or other configurations of the same chip.
That is a common problem with fully free form machine learning solutions: They can stumble upon something that technically works in their training set, but any human who understood the full system would never actually use due to the other problems associated with it.
> The quadratic region (before the "on") is far more energy efficient
Take a look at the structure of something like CMOS and you’ll see why running transistors in anything other than “on” or “off” is definitely not energy efficient. In fact, the transitions are where the energy usage largely goes. We try to get through that transition period as rapidly as possible because minimal current flows when the transistors reach the on or off state.
There are other logic arrangements, but I don’t understand what you’re getting at by suggesting circuits would be more efficient. Are you referring to the reduced gate charge?
> Take a look at the structure of something like CMOS and you’ll see why running transistors in anything other than “on” or “off” is definitely not energy efficient. In fact, the transitions are where the energy usage largely goes. We try to get through that transition period as rapidly as possible because minimal current flows when the transistors reach the on or off state.
Sounds like you might be thinking of power electronic circuits rather than CMOS. In a CMOS logic circuit, current does not flow from Vdd to ground as long as either the p-type or the n-type transistor is fully switched off. The circuit under discussion was operated in subthreshold mode, in which one transistor in a complementary pair is partially switched on and the other is fully switched off. So it still only uses power during transitions, and the energy consumed in each transition is lower than in the normal mode because less voltage is switched at the transistor gate.
> In a CMOS logic circuit, current does not flow from Vdd to ground as long as either the p-type or the n-type transistor is fully switched off.
Right, but how do you get the transistor fully switched off? Think about what happens during the time when it’s transitioning between on and off.
You can run the transistors from the previous stage in a different part of the curve, but that’s not an isolated effect. Everything that impacts switching speed and reduces the current flowing to turn the next gate on or off will also impact power consumption.
There might be some theoretical optimization where the transistors are driven differently, but at what cost of extra silicon and how delicate is the balance between squeezing a little more efficiency and operating too close to the point where minor manufacturing changes can become outsized problems?
The previous poster was probably thinking about very low power analog circuits or extremely slow digital circuits (like those used in wrist watches), where the on-state of the MOS transistors is in the subthreshold conduction region (while the off state is the same off state as in any other CMOS circuits, ensuring a static power consumption determined only by leakage).
Such circuits are useful for something powered by a battery that must have a lifetime measured in years, but they cannot operate at high speeds.
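As a back-of-envelope illustration of why a lower switched voltage helps (numbers assumed, not from the thread): dynamic switching energy scales roughly as C·V², so cutting the swing by 3x cuts energy per transition by roughly 9x, which is the trade sub/near-threshold designs make against speed.

    # Back-of-envelope only: dynamic switching energy scales as C * V^2. Values are
    # illustrative assumptions (1 fF of switched capacitance), not from the thread.
    C = 1e-15
    energy = lambda v: C * v ** 2
    print(energy(0.9))   # ~8.1e-16 J per switching cycle at a nominal 0.9 V swing
    print(energy(0.3))   # ~9.0e-17 J near threshold: ~9x less energy, but far slower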
Seems like this overfitting problem could have been trivially fixed by running it on more than one chip, no?
Unfortunately not. This is analogous to writing a C program that relied on undefined behavior on the specific architecture and CPU of your developer machine. It’s not portable.
The behavior could change from one manufacturing run to another. The behavior could disappear altogether in a future revision of the chip.
The behavior could even disappear if you change some other part of the design that then relocated the logic to a different set of cells on the chip. This was noted in the experiment where certain behavior depended on logic being placed in a specific location, generating certain timings.
If you rely on anything other than the behavior defined by the specifications, you’re at risk of it breaking. This is a problem with arriving at empirical solutions via guess and check, too.
Ideally you’d do everything in simulation rather than on-chip where possible. The simulator would only function in ways supported by the specifications of the chip without allowing undefined behavior.
>The behavior could change from one manufacturing run to another. The behavior could disappear altogether in a future revision of the chip.
That's the overfitting they were referring to. Relying on the individual behaviour is the overfit. Running on multiple chips (at learning time) reduces the benefit of using an improvement that is specific to one chip.
You are correct that simulation is the better solution, but you have to do more than just limit the simulation to the operating range of the components; you also have to introduce variances similar to the specified production tolerances. If the simulator assumed that the behaviour of two similar components was absolutely identical, then within-tolerance manufacturing variations could be magnified.
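A minimal sketch of that idea, with a toy evaluator and made-up device parameters standing in for a real circuit simulator: jitter each virtual chip within tolerance and score the candidate by its worst case.

    import random

    def evaluate(candidate, chip):
        # Toy stand-in for the simulator: reward candidates whose bias point matches
        # the chip's (randomly varied) threshold voltage.
        return -abs(candidate["bias"] - chip["vth"])

    def robust_fitness(candidate, n_chips=8, tolerance=0.05, seed=0):
        # Score a candidate on several virtual chips whose parameters are jittered
        # within the stated tolerance, then keep the worst case so the optimizer
        # can't latch onto one chip's quirks.
        rng = random.Random(seed)
        scores = []
        for _ in range(n_chips):
            chip = {"vth": 0.40 * (1 + rng.uniform(-tolerance, tolerance))}
            scores.append(evaluate(candidate, chip))
        return min(scores)

    print(robust_fitness({"bias": 0.40}))   # tuned to the nominal process
    print(robust_fitness({"bias": 0.43}))   # overfitted to one high-Vth chip: scores worse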
In other words, optimization algorithms in general are prone to overfitting. Fortunately there are techniques to deal with that. Thing is, once you find a solution that generalizes better across different chips, it probably won't be as small as the one originally found.
I'm having trouble understanding. Chips with very high transistor counts tend to use saturation/turn-off almost exclusively. Very little is done in the linear region because it burns a lot of power and it's less predictable.
> Operating transistors outside the linear region (the saturated "on")
Do fuzz pedals count?
To be fair, we know they work and basically how they work, but the sonic nuances can be very hard to predict from a schematic.
I believe neuromorphic spiking hardware will be the step to truly revolutionize the field of anthropod contagion issues.
Can’t tell if this is a joke or not
I came in already knowing what neuromorphic hardware is and I'm also unsure
joke I think, anthropod is probably another way of saying bugs/ants haha
*arthropod, as in "joint(ed) leg" (cf. arthritis), GP misspelled it. "Anthropod" would mean something like "human leg".
Oh christ you're right, they were actually being really funny. I was being super literal and imagined them being very excited about futuristic advances in giant isopod diagnosis and care
Yeah, anthropic bugs. The planet is infested with them.
Bug zapper
at last, something possibly more buggy than vibe coding!
My thoughts, exactly.
I really wish I still had the link, but there used to be a website that listed a bunch of times in which machine learning was used (mostly via reinforcement learning) to teach a computer how to play a video game and it ended up using perverse strategies that no human would do. Like exploiting weird glitches (https://www.youtube.com/watch?v=meE5aaRJ0Zs shows this with Q*bert)
Closest I've found to the old list I used to go to is this: https://heystacks.com/doc/186/specification-gaming-examples-...
In my thesis many years ago [0] I used EAs to build bicycle wheels. They were so annoyingly good at exploiting whatever idiosyncrasies existed in my wheel simulator. In the first iterations of my simulator it managed to evolve wheels that would slowly oscillate, due to floating-point instability or something; when forces were applied, they would increase and increase until the whole simulator exploded and the recorded forces were all over the place, which of course out-competed every other wheel in at least some objective dimension.
After fixing those bugs, I mostly struggled with it taunting me. Like building a wheel with all the spokes going from the hub straight up to the rim. It would of course break down when rolling, but on the objective of "how much load can it handle on the bike" it again out-competed every other wheel, and thus was on the Pareto front of that objective and kept showing up through all my tests. Hated that guy, heh. I later changed it to test all wheels in at least 4 orientations; it would then still taunt me with wheels like (c) in this figure[1], exploiting that.
[0]: https://news.ycombinator.com/item?id=10410813 [1]: https://imgur.com/a/LsONTGc
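A hedged sketch of that multi-orientation fix, with a toy load function standing in for the real wheel simulator (nothing here is from the actual thesis code):

    import math

    def load_capacity(wheel, angle):
        # Toy stand-in for the simulator: a wheel is just a list of spoke angles,
        # and capacity depends on how well spokes line up with the load direction.
        return sum(max(0.0, math.cos(spoke - angle)) for spoke in wheel)

    def fitness(wheel, n_orientations=4):
        # Evaluate the wheel under load from several directions and keep the minimum,
        # so it can't win by being strong in only one lucky orientation.
        angles = [2 * math.pi * k / n_orientations for k in range(n_orientations)]
        return min(load_capacity(wheel, a) for a in angles)

    all_up = [math.pi / 2] * 8                         # every spoke pointing one way
    radial = [2 * math.pi * k / 8 for k in range(8)]   # evenly spread spokes
    print(fitness(all_up), fitness(radial))            # the degenerate wheel now loses

Taking the minimum rather than the average is what kills the "all spokes straight up" wheel: it only has to fail in one orientation to lose.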
My favorite example was a game of pong with the goal of staying alive as long as possible. One ML algo just paused the game and left it like that.
My favorite was the ML learning how to optimally make the lowest-impact landing in a flight simulator— it discovered that it could wrap the impact float value if the impact was high enough so instead of figuring out the optimal landing, it started figuring out the optimal path to the highest-impact crashes.
This comment ought to be higher up. Such a perfect summary of what I have struggled to understand, which is the “danger” of AI once we allow it to control things
And yes, you can fix the bug, but the bike wheel guy shows there will always be another bug. We need a paper/proof that invents a process that can put an AI-supported (no human intervention) finite cap or limiter or something on the possible bug surface.
There is an apocryphal story about AI:
A conglomerate developed an AI and vision system that you could hook up to your anti-aircraft systems to eliminate any chance of friendly fire. DARPA and the Pentagon went wild, pushing the system through testing so they could get to the live demonstration.
They hook up a live system loaded with dummy rounds, fly a few friendly planes over, and everything looks good. However, when they fly a captured MiG-21 over, the system fails to respond. The brass is upset and the engineers are all scratching their heads trying to figure out what is going on, but as the sun sets the system lights up, trying to shoot down anything in the sky.
They quickly shut down the system and do a postmortem. In the review they find that all the training data for friendly planes are perfect-weather, blue-sky overflights, and all the training data for the enemy are nighttime/low-light pictures. The AI determined that anything flying during the day is friendly and anything at night is to be terminated with extreme prejudice.
we used synthetic data for training a (sort of) similar system. not gonna get into the exact specifics, but we didn't have a lot of images of one kind of failure use-case.
like they're just not that many pictures of this stuff. we needed hundreds, ideally thousands, and had, maybe, a dozen or so.
okay, so we'll get a couple of talented picture / design guys from the UI teams to come out and do a little photoshop of the images. take some of the existing ones, play with photoshop, make a couple of similar-but-not-quite-the-same ones, and then hack those in a few ways. load those into the ML and tell em they're targets and to flag on those, etc. etc.
took a week or two, no dramas, early results were promising. then it just started failing.
turns out we ran into issues with two (2) pixels, black pixels against a background of darker black shades, that the human eye basically didn't see or notice; these were artifacts from photoshopping, and then re-using parts of a previous image multiple times. the ML started determining that 51% or more of the photos had those 2 pixels in there, and that photos lacking those -- even when painfully obvious to the naked eye -- were fails.
like, zooming in at it directly you're like yea, okay, those pixels might be different, but otherwise you'd never see it. thankfully output highlighting flagged it reasonably quickly but still took 2-3 weeks to nail down the issue.
That likely is an urban legend. See https://gwern.net/tank
Is AI the danger, or is our inability to simplify a problem down to an objective function the problem?
If anything, AI could help by "understanding" the real objective, so we don't have to code these simplified goals that ML models end up gaming, no?
Simplification is the problem here, arguably. Even a simple-sounding objective (say, a bicycle wheel that holds load the best) has at least one implicit assumption: it will be handled and used in the real world. Which means it'll be subject to sloppy handling and thermal spikes and weather and abuse and all kinds of things beyond just meeting the goal. Any of those cheesy AI designs, if you were to 3D-print/replicate them, would fall apart as you picked them up. So the problem seems to be that the ML algorithm is given too simple a goal function, one lacking the "used in the real world" part.
I feel that a good first step would be to introduce some kind of random jitter into the simulation. Like, in case of the wheels, introduce road bumps, and perhaps start each run by simulating dropping the wheel from a short distance. This should quickly weed out "too clever" solutions - as long as the jitter is random enough, so RL won't pick up on it and start to exploit its non-randomness.
Speaking of road bumps: there is no such thing in reality as a perfectly flat road; if the wheel simulator is just rolling wheels on mathematically perfect roads, that's a big deviation from reality - precisely the kind that allows for "hacky" solutions that are not possible in the real world.
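One way to sketch that (toy simulator and numbers assumed, not from any real setup): wrap whatever simulator you already have so every evaluation draws a fresh random road profile and drop height, and average a few episodes.

    import random

    def randomized_eval(simulate, candidate, episodes=5):
        # Domain-randomization wrapper: every evaluation draws a fresh random road
        # profile and an initial drop, so the optimizer can't memorize one environment.
        scores = []
        for _ in range(episodes):
            rng = random.Random()                                   # unseeded on purpose
            bumps = [rng.uniform(0.0, 0.02) for _ in range(100)]    # 0-2 cm road bumps
            drop = rng.uniform(0.05, 0.20)                          # 5-20 cm initial drop
            scores.append(simulate(candidate, bumps, drop))
        return sum(scores) / len(scores)

    # Toy usage: a 'candidate' is just a stiffness value, and the fake simulator zeroes
    # the score whenever the drop plus the worst bump exceeds what that stiffness tolerates.
    toy_sim = lambda k, bumps, drop: k if drop + max(bumps) < 0.18 + 0.5 * k else 0.0
    print(randomized_eval(toy_sim, 0.02), randomized_eval(toy_sim, 0.5))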
You would have to introduce jitter along every possible dimension, while the dimensions themselves are continually expanding (as illuminated by the bike wheel example). The combination of jitter x dimensions leads to an undefined problem (AKA a theory of everything) in exponential fashion.
How would more AI help? "given this goal with these parameters, figure out if another AI will ever game it into eventual thermonuclear war. "
Feels halting problem-esque.
My point was that instead of blaming ML - or optimisation tools really - for gaming objective functions and coming up with non-solutions that do maximise reward, AI could instead be used to measure the reward/fitness of the solution.
So to the OP's example "optimise a bike wheel", technically an AI should be able to understand whether a proposed wheel is good or not, in a similar way to a human.
Humans don't simplify problems by reducing them to objective functions: we simplify them by reducing them to specific instances of abstract concepts. Human thought is fundamentally different to the alien processes of naïve optimising agents.
We do understand the "real objectives", and our inability to communicate this understanding to hill-climbing algorithms is a sign of the depth of our understanding. There's no reason to believe that anything we yet call "AI" is capable of translating our understanding into a form that, magically, makes the hill-climbing algorithm output the correct answer.
>simplify a problem down to an objective function
Yes, I have an intuition that this is NP hard though
Humans have the same vulnerability
https://en.wikipedia.org/wiki/Perverse_incentive
All these claims are like "programming is impossible because I typed in a program and it had a bug". Yes, everyone's first attempt at a reward function is hackable. So you have to tighten up the reward function to exclude solutions you don't want.
Ummm, I'm going to hold off on that FSD subscription for a bit longer...
Is that Learnfun/Playfun that tom7 made? That one paused just before losing at Tetris and left it like that, because any other input would make it lose.
No I want to say this was ~10 years ago. Happened to a university researcher IIRC.
Make no mistake, most humans will exploit any glitches and bugs they can find for personal advantage in a game. It’s just that machines can exploit timing bugs better.
Some people are able to do frame-perfect inputs semi-consistently, from what I understand. I don’t understand how, as my own performance is hitting a 100 ms window maybe once every other try.
Maybe they have better equipment?
If you're using a typical PC (or $deity forbid, a phone) with a typical consumer OS, there's several sources of variability between your controller and the visual feedback you receive from the game, each of which could randomly introduce delays on the order of milliseconds or more. That "randomly" here is the key phrase - lag itself is not a problem, the variability is.
Better equipment or not, frame-perfect input is just hard to do and I'm impressed with people being able to do it.
There are a few very cool examples where someone recently used RL to solve Trackmania and ended up having to add all sorts of constraints/penalties to prevent extremely strange exploits/glitches that were discovered, IIRC… it’s been a while since I watched.
https://youtu.be/Dw3BZ6O_8LY?si=VUcJa_hfCxjZhhfR
https://youtu.be/NUl6QikjR04?si=DpZ-iqVdqjzahkwy
Well, in the case of the latter, there was a vaguely known glitch for driving on the nose that allowed for better speeds than possible on 4 wheels, but it would be completely uncontrollable to a human. He figured out how to break the problem down into steps that the NN could gradually learn piecewise, until he had cars racing around tracks while balancing on their nose.
It turned out to have learned to keep the car spinning on its nose for stability, and timing inputs to upset the spinning balance at the right moment to touch the ground with the tire to shoot off in a desired direction.
I think the overall lesson is that, to make useful machine learning, we must break our problems down into pieces small enough that an algorithm can truly "build up skills" and learn naturally, under the correct guidance.
One of the games that I stumbled across on Steam recently was "AI Learns to Drive" - https://store.steampowered.com/app/3312030/AI_Learns_To_Driv...
It's a neat toy (not really "useful" nor too much of a "game") for generating interest in how neural nets work.
I'm almost 100% sure this is the link you're looking for: https://docs.google.com/spreadsheets/d/e/2PACX-1vRPiprOaC3Hs...
Haha that was actually the same one I posted in my comment.
This was some old website. A coworker sent it to me on Hipchat at my previous job about 10 years ago. And finding anything online older than like 5 years is nearly impossible unless you have the exact URL on hand.
Oh sorry! I recognized the description but since I recalled mine being a Google Sheets link, I just went straight into search mode - and yep, it actually took me a bit to find.
For the model, the weird glitches are just another element of the game. As it can't reason and has no theory of the world or even any real knowledge of what it is doing, the model doesn't have the prior assumptions a human would have about how the game is supposed to be played.
If you think about it, even using the term "perverse" is a result of us anthropomorphizing any object in the universe that does anything we believe is in the realm of things humans do.
Not quite what you're describing, but no one has yet linked the classic Tom7 series where he applies deep learning to classic NES games: https://youtu.be/xOCurBYI_gY
> using perverse strategies that no human would do
Of course we do use perverse strategies and glitches in adversarial multiplayer all the time.
Case in point: the chainsaw glitch, tumblebuffs, early hits, and perfect blocks in Elden Ring.
The recent nvidia AI cuda engineer was also similar: https://news.ycombinator.com/item?id=43113941
on youtube, codebullet remakes games so that he can try different AI techniques to beat them.
I've referenced this paper many times here; it's easily in my top 10 of papers I've ever read. It's one of those ones that, if you go into it blind, you have several "Oh no f'king way" moments.
The interesting thing to me now is... that research is very much a product of the right time. The specific Xilinx FPGA he was using was incredibly simple by today's standards and this is actually what allowed it to work so well. It was 5v, and from what I remember, the binary bitstream to program it was either completely documented, or he was able to easily generate the bitstreams by studying the output of the Xilinx router- in that era Xilinx had a manual PnR tool where you could physically draw how the blocks connected by hand if you wanted. All the blocks were the same and laid out physically how you'd expect. And the important part is that you couldn't brick the chip with an invalid binary bitstream programming. So if a generation made something wonky, it still configured the chip and ran it, no harm.
Most, if not all, modern FPGAs just cannot be programmed like this anymore. Just randomly mutating a bitstream would, at best, make an invalid binary that the chip just won't burn. Or, at worst, brick it.
I remember this paper being discussed in the novel "Science of Discworld" -- a super interesting book involving collaboration between a fiction author and some real-world scientists -- where the fictional characters in the novel discover our universe and its rules. I always thought there was some deep insight to be had about the universe within this paper. Now, more so, I think the unexpectedness says something instead about the nature of engineering and control and the human mechanisms for understanding these sorts of systems. Almost by definition, human engineering relies on linearized approximations to characterize the effects being manipulated, so something which operates in modes far outside those models is basically inscrutable. I think that's kind of expected, but the results still provoke the fascination to ponder the solutions that superhuman engineering methods might yet find with modern technical substrates.
Xe highly recommend the series! Xe keep going back to them for bedtime audio book listening. Chapters alternate between fact and fiction and the mix of intriguing narrative and drier but compelling academic talk help put xir otherwise overly busy mind to rest. In fact, xe bought softcover copies of two of them just last week.
The science is no longer cutting edge (some are over twenty years old) but the deeper principles hold and Discworld makes for an excellent foil to our own Roundworld, just as Sir Pratchett intended.
Indeed, the series says more about us as humans and our relationship to the universe than the universe itself and xe love that.
You're not doing whatever oppressed group you're representing a favor by using weird, alienating pronouns.
People don't necessarily choose their own pronouns based on how it will reflect on an oppressed group, and they don't necessarily intend to be representing a group when representing themselves.
IIRC the flip-side was that it was hideously specific to a particular model and batch of hardware, because it relied on something that would otherwise be considered a manufacturing flaw.
Not even one batch. It was specific to that exact one chip it was evolved on. Trying to move it to another chip of the same model would produce unreliable results.
There is actually a whole lot of variance between individual silicon chips; even two chips right next to each other on the wafer will perform slightly differently. They will all meet the spec on the datasheet, but datasheets always specify ranges, not exact values.
If I recall the original article, I believe it even went a step further. While running on the same chip it evolved on, if you unplugged the lamp that was in the outlet closest to the chip, the chip stopped working. It was really fascinating how environmentally specific it evolved.
That said, it seems like it would be very doable to first evolve a chip with the functionality you need in a single environment, then slowly vary parameters to evolve it to be more robust.
Or, vice versa, begin evolving the algorithm using a fitness function that is the average performance across 5 very different chips, to ensure some robustness is built in from the beginning.
> slowly vary parameters to evolve it to be more robust
Injecting noise and other constraints (like forcing it place circuits in different parts of the device) are totally valid when it needs to evolve in-place.
For the most part, I think it would be better to run in a simulator where it can evolve against an abstract model, then it couldn't overfit to the specific device and environment. This doesn't work if the best simulator of the system is the system itself.
https://en.wikipedia.org/wiki/Robust_optimization
https://www2.isye.gatech.edu/~nemirovs/FullBookDec11.pdf
Robust Optimization https://www.youtube.com/watch?v=-tagu4Zy9Nk
Yeah, if you took it outside the temperature envelope of the lab it failed. I guess thermal expansion?
There were also a bunch of cells that had inputs, but no outputs. When you disconnected them... the circuit stopped working. Shades of "magic" and "more magic".
I've never worked with it, but I've had a fascination with GA/GP ever since this paper/the Tierra paper. I do wonder why it's such an attractive technique - simulated annealing or hill climbing just don't have the same appeal. It's the biological metaphor, I think.
A long time ago, maybe in the Russian journal "Radio" around 198x, someone described how, if you got a certain transistor from a particular batch from a particular factory/date and connected it in some weird way, it would make a full FM radio (or something similarly complex), because the yields had gone wrong. No idea how they figured that out.
But mistakes aside, what would it be like if chips from the factory could learn / fine-tune how they work (better), on the fly?
AM radio can be "detected" with a semiconductor, so this kinda makes sense if you squint. If you can find it, someday, update this!
At my high school, there was an FM radio transmitter on the other side of the street. Pretty often you could hear one of the stations in the computer speakers in the library, so FM radio can be detected by simple analog circuits.
Interestingly, radios used to be called transistors colloquially.
I remember talking about this with my friend and fellow EE grad Connor a few years ago. The chip's design really feels like a biological approach to electrical engineering, in the way that all of the layers we humans like to neatly organize our concepts into just get totally upended and messed with.
Biology also uses tons of redundancy and error correction that the generative algorithm approach lacks.
Though, the algorithm might plausibly evolve it if it were trained in a more hostile environment.
Yup, was coming here to basically say the same thing. Amazing innovations happen when you let a computer just do arbitrary optimization/hill climbing.
Now, you can impose additional constraints on the problem if you want to keep it using transistors properly or to not use EM side effects, etc.
This headline is mostly engagement bait: first, it is nothing new, and second, it is actually fully controllable.
I read the Damn Interesting post back when it came out, and seeing the title of this post immediately led me to think of Thompson's work as well.
The interesting thing about this project is that it shouldn’t even be possible if the chip behaved as an abstract logical circuit, since then it would simply implement a finite automaton. You must abuse the underlying physics to make the logic gates behave like something else.
Abstract logical circuits are still leaky abstractions, for example hazards are possible.
https://en.wikipedia.org/wiki/Hazard_(logic)
Yes you can go even more abstract and ignore time flow completely and have pure boolean logic, but then it can't be practically implemented at all.
That's exactly what I thought of too when I saw the title.
Basically brute force + gradient descent.
I should have read the comments because I just spent 20 minutes trying to find the article in Discover magazine about this exact thing: https://www.discovermagazine.com/technology/evolving-a-consc...
The part that I always remember is that if they changed the temperature in the room by a couple degrees the chip would stop working.
Thompson is who I immediately thought of. Thanks for digging up the actual cite.
“More compact than anything a human engineer would ever come up with” … sounds more like they built an artificial Steve Wozniak
Reminds of disassembled executables, unintelligible to the untrained eye.
It's even more convoluted when also re-interpreted into C language.
Designs nobody would ever come up with, but equivalent, and even compiler tricks we'd not have known about.
classic thank you! I've been trying to find this recently. I first heard about this in my genetic algorithms class more than 15 years ago.
And this is the kind of technology we use to decide if someone should get a loan, or if something is a human about to be run over by a car.
I think I'm going to simply climb up a tree and wait this one out.
What if it invented a new kind of human, or a different kind of running over?
A classic. What's old is new again
So, the future is reliance on undefined but reproducible behavior
Not sure that's working out well for democracy
Relying on nuances of the abstraction and undefined or variable characteristics sounds like a very very bad idea to me.
The one thing you generally want for circuits is reproducibility.
I strongly dislike when people say AI when they actually mean optimizer. Calling the product of an optimizer “AI” is more defensible, you optimized an MLP and now it writes poetry. Fine. Is the chip itself the AI here? That’s the product of the optimizer. Or is it the 200 lines of code that defines a reward and iterates the traces?
Yesterday I used a novel AI technology known as “llvm” to remove dead code paths from my compiled programs.
Say no more. Here's $100 million to take this product to market.
> known as “llvm” to remove dead code paths
Large Language Vulture Model?
At the risk of responding to obvious satire...
https://en.wikipedia.org/wiki/LLVM
Released approx 20 years before ASGSI (Artificial Super General Super Intelligence)
At the risk of responding to obvious satire...
Isn't ASGSI just a marketing term, while ASSGSI is the one as smart as a human?
You may find it funny, but it's unclear if profile-guided optimization is permitted under typical corporate AI policies.
Optimization is near and dear to my heart (see username), but I think it’s fine to call optimization processes AI because they are in the classical sense.
Once a computer can do something, it's no longer called AI but just an algorithm.
At least, that used to be the case before the current AI summer and hype.
If you have an agent in an environment, the program that controls its choices has pretty consistently been called AI even when it's a simple design.
But I'm skeptical of calling most optimizers AI.
Once a computer can do something, it's just an algorithm. LLMs can't really do anything right, so they're AI. ;)
> [...] the classical sense.
Which one? Fuzzy logic, ANNs, symbol processing, expert systems, ...?
It's always entertaining to watch the hype cycles. Hopefully this one will have a net positive impact on society.
Chapters 4, 6, 10, and 11 of Russell and Norvig’s Artificial Intelligence text all fit this bill.
AI is such an ill-defined word that it's very hard to say what it's definitely not.
Marvin Minsky -- father of classical AI -- pointed out that intelligence is a "suitcase word" [1] which can be stuffed with many different meanings.
[1] https://www.nature.com/articles/530282a
I think it is as follows: we call it AI nowadays as long as we cannot clearly and easily show how to get to the result, which means the computer did something that seems intelligent to us for the moment. Once we can explain things and write down a concise algorithm, we hesitate to call it AI.
Basically, we call things AI, that we are too stupid to understand.
I think what’s really happened is we get output that’s close enough to normal communication to “feel” human. I could say it’s all a giant trick, which it kind of is, but we’ve also gotten to the point where the trick is also useful for many things that previously didn’t have a good solution.
Marketing is as marketing does.
> I strongly dislike when people say AI when they actually mean optimizer.
It probably uses a relatively simple hill climbing algorithm, but I would agree that it could still be classified as machine learning. AI is just the new, hip term for ML.
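For reference, plain hill climbing really is about this simple; a generic sketch, not whatever method the paper actually used:

    import random

    def hill_climb(init, perturb, score, iters=10_000, seed=0):
        # Generic hill climbing: keep a candidate, try a random tweak, and accept the
        # tweak only if it scores at least as well. No gradients, no model, just search.
        rng = random.Random(seed)
        best, best_score = init, score(init)
        for _ in range(iters):
            cand = perturb(best, rng)
            s = score(cand)
            if s >= best_score:
                best, best_score = cand, s
        return best, best_score

    # Toy usage: find the peak of a 1-D function by nudging x around.
    print(hill_climb(0.0, lambda x, r: x + r.uniform(-0.1, 0.1), lambda x: -(x - 3.0) ** 2))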
What? Quite the opposite. AI is the original and broader term, ML is a subset of AI. Deep Learning was the "hot" terminology around 2015-2018, and since 2022/Chatgpt, LLM has become the trendy word. Yes, people now talk about "AI" as well, but that term has always been there, and anytime some AI technique becomes talked about, the term AI gets thrown around a lot too.
(Note - I may have misunderstood your meaning btw, if so apologies!)
If you want grant and VC dollars, you’ll rebrand things as “AI”.
See also: Iced tea company rebranded to include “blockchain” in name, and stocks jumped.
https://www.cnbc.com/2017/12/21/long-island-iced-tea-micro-c...
The "OG" AI research, like the the era of Minsky's AI Lab at MIT in the 1970s, broke AI into a few sub-domains, of which optimization was one. So long before we used the term AI to describe an LLM-based chat bot, we used it to describe optimization algorithms like genetic algorithms, random forests, support vector machines, etc.
Is there anything we can even call AI that would be correct?
Things used to stop being called AI once they worked well, because AI was artificial human thought and those things weren't that.
Now they start being called AI, because AI is artificial human thought and those things are that.
What changed? Our perception of the meaning of "thought".
The chip is not called AI-chip but rather AI-designed chip. At least in the title.
My point is that it’s equally ridiculous to call either AI. If our chip here is not the AI then the AI has to be the optimizer. By extension that means AdamW is more of an AI than ChatGPT.
I don't understand. I learnt about optimizers, and genetic algorithms in my AI courses. There are lots of different things we call AI, from classical AI (algorithms for discrete and continuous search, planning, sat, Bayesian stuff, decision trees, etc.) to more contemporary deep learning, transformers, genAI etc. AI is a very very broad category of topics.
Optimization can be a tool used in the creation of AI. I'm taking issue with people who say their optimizer is an AI. We don't need to personify every technology that can be used to automate complex tasks. All that does is further dilute an already overloaded term.
I agree that the article is wrong in using the wording “the AI”. However, firstly the original publication [0] doesn’t mention AI at all, only deep-learning models, and neither do any of the quotes in the article. Secondly, it is customary to categorize the technology resulting from AI research as AI — just not as “an AI”. The former does not imply any personification. You can have algorithms that exhibit intelligence without them constituting any kind of personal identity.
[0] https://www.nature.com/articles/s41467-024-54178-1
It's not "an AI", it's AI as in artificial intelligence, the study of making machines do things that humans do.
A fairly simple set of if statements is AI (an "expert system" specifically).
AI is _not_ just talking movie robots.
You can remove the word 'an' if you're attributing some weird meaning to it, the point is still valid. Genetic algorithms and optimizers are usually in there to make AI algorithms, they aren't themselves AI algorithms.
And you have to be doing something rather specific with a pile of if statements for it to count as an expert system.
Who said it was 'an AI'? Do you understand what intelligence means? And what artificial means?
In game dev we've called a bunch of weighted If statements AI since the 80s. Sometimes they're not even weighted.
I think that's a bit different. The term is overloaded. There's "the machine is thinking" AI and then there's "this fairly primitive code controls an agent" AI. The former describes the technique while the latter describes the use case.
Clippy was an AI but he wasn't an AI.
Artificial intelligence, as others are using it here to cover a broad field of study or set of techniques. You seem to be objecting because the described product is not "an artificial intelligence", i.e. an artificial mind.
For some of us, your objection sounds as silly as if we were to tell some student they didn't use algebra, because what they wrote down isn't "an algebra".
You use optimization to train AI, but we usually refer to AI as being the parametrized function approximator that is optimized to fit the data, not the optimizer or loss function themselves.
This is "just" an optimizer being used in conjunction with a simulation, which we've been doing for a long, long time. It's cool, but it's not AI.
You use teachers to train humans, but that doesn't mean teachers can't also be humans.
This is totally irrelevant.
Optimization is a branch of mathematics concerned with optimization techniques, and the analysis and quality of possible solutions. An optimizer is an algorithm concerned with finding optima of functions. You don't get to rewrite decades of mathematical literature because it gives you AI vibes.
Yeah, you need an optimizer to train AI, but it's not the AI part. Most people would refer to and understand AI as being the thing they interact with. You can't interact with an optimizer, but you can interact with the function that is being optimized.
I'm honestly stunned that this is even a controversial position.
FWIW, I suspect there are more folks here with exposure to decades of computer science literature about AI than to comparable mathematics literature.
The CS literature has used AI to refer to nearly any advanced search algorithm, e.g. during the prior AI boom and bust cycle around symbolic AI. In this literature, it is idiomatic that AI techniques are the broad category of search and optimization techniques. There wasn't necessarily any "training" involved, as machine learning was considered part of the AI topic area but not its entirety.
Maybe I'm getting old and grumpy.
It's always been acknowledged that various disciplines had significant crossover, e.g. ML and operations research, but I've never seen anyone claim that optimization is AI until recently.
Ian Goodfellow's book is, what, 10 years old at this point? The fundamentals in that book cover all of ML from classical to deep learning, and pretty clearly enumerate the different components necessary to do ML, and there's no doubt that optimization is one of them. But to say that it is AI in the way that most people would probably understand it? It's a stretch, and hinges on whether you're using AI to refer to the collection of techniques or the discipline, as opposed to the output (i.e. the "intelligence"). I, and I'd argue most people, use AI to refer to the latter, but I guess the distinction between the discipline and the product is vague enough for media hype.
And to be clear, I'm not trying to take away from the authors. Optimization is one of the tools I like to throw around, both in my own projects and professionally. I love seeing cool applications of optimization, and this definitely qualifies. I just don't agree that everything that uses optimization is AI, because it's an unnecessary blurring of boundaries.
> You don't get to rewrite decades of mathematical literature because it gives you AI vibes.
AI as a term was invented to describe exactly this. Any usage of the term AI which does not include this is a misunderstanding of the term. You don't get to rewrite decades of computer science literature because it fails to give you AI vibes.
> Most people would refer to and understand AI as being the thing they interact with. You can't interact with an optimizer, but you can interact with the function that is being optimized.
I have no idea what you mean by "interact with" in this context. You can use a non AI optimizer to train an AI. You can also create an AI that serves the function of an optimizer. Optimization is a task, artificial intelligence is an approach to tasks. A neural network trained to optimize chip design is exactly as much an AI as a neural network trained to predict protein folding or translate speech.
Yes. It's the optimizer here that's called "AI" because AIs are optimizers - and so are humans. It's a matter of sophistication.
I don't understand what your gripe is. Both are AI. Even rudimentary decision trees are AI.
There's no function here that is analogous to a decision tree, or a parametrized model, just an optimizer and a loss function with a simulator. This isn't AI in the way it's commonly understood, which is the function that takes an input and produces a learned output.
The entire point of the thing is that it takes an input and produces an output. The output is the designed chip.
An optimizer produces a single optimized set of parameters. AI is a (usually parametrized) function mapping a collection of input states to a collection of output states. The function is the AI, not the optimizer. I'd suggest anyone who thinks otherwise go and do some basic reading.
Let's just call every Turing-complete system AI and be done with it.
Sigh, another day, another post I must copy paste my bookmarked Wikipedia entry for:
> "The AI effect" refers to a phenomenon where either the definition of AI or the concept of intelligence is adjusted to exclude capabilities that AI systems have mastered. This often manifests as tasks that AI can now perform successfully no longer being considered part of AI, or as the notion of intelligence itself being redefined to exclude AI achievements.[4][2][1] Edward Geist credits John McCarthy for coining the term "AI effect" to describe this phenomenon.[4]
> McCorduck calls it an "odd paradox" that "practical AI successes, computational programs that actually achieved intelligent behavior were soon assimilated into whatever application domain they were found to be useful in, and became silent partners alongside other problem-solving approaches, which left AI researchers to deal only with the 'failures', the tough nuts that couldn't yet be cracked."[5] It is an example of moving the goalposts.[6]
> Tesler's Theorem is:
> AI is whatever hasn't been done yet.
> — Larry Tesler
https://en.wikipedia.org/wiki/AI_effect
Prior to 2021/202-whenever, most sensible people called this stuff deep learning / machine learning etc. For over 15+ years it’s been called machine learning — “getting machines to complete tasks without being explicitly programmed to do so”.
since 2021/whenever LLM applications got popular everyone has been mentioning AI. this happened before during the previous mini-hype cycle around 2016-ish where everyone was claiming neural networks were “AI”. even though, historically, they were still referred to by academics as machine learning.
no-one serious who actually works on these things, and isn’t interested in making hordes of $$$ or getting popular on social media, calls this stuff AI. so if there were a wikipedia link one might want to include on this thread, I’d say it would be this one — https://en.m.wikipedia.org/wiki/Advertising
because, let’s face it, advertising/marketing teams selling products using linear regression as “AI” are the ones shifting the definition into utter meaninglessness.
so it’s no surprise people on HN, some of whom actually know stuff about things, would be frustrated and annoyed and get tetchy about calling things “AI” (when it isn’t) after 3 sodding years of this hype cycle. i was sick of it after a month. imagine how i feel!
- edit, removed line breaks.
Machine learning is a subfield of AI. Complaining about calling ML AI is like complaining about calling Serena Williams an "athlete" because she's actually a "tennis player"
You've missed the point I was making it seems, so I'll condense and focus down on it.
The reason why the "AI" goalposts always seem to shift -- is not because people suddenly decide to change the definition, but because the definition gets watered down by advertising people etc. Most people who know anything call this stuff deep learning/machine learning to avoid that specific problem.
Personally, I can't wait for people who work in advertising to get put on the same spaceship as the marketers and telephone sanitizers. (It's not just people in advertising. i just don't like advertising people in particular).
--
I'd argue machine learning is actually a sub-field within statistics. but then we're gonna get into splitting hairs about whether Serena Williams is an athlete, or a professional sports player. which wasn't really the point I was making and isn't actually that important. (also, it can be a sub-field of both, so then neither of us is wrong, or right. isn't language fun!).
We'll never build true AI, just reach some point where we prove humans aren't really all that intelligent either
AI is when Einstein is your butler.
> It is an example of moving the goalposts.
On the contrary. The "AI effect" is an example of attempting to hold others to goalposts that they never agreed to in the first place.
Instead of saying "this is AI and if you don't agree then you're shifting the goalposts" instead try asking others "what future developments would you consider to be AI" and see what sort of answers you get.
People did ask that, and they got back answers like "beating grandmasters at chess" and "being able to hold a conversation with a human," but no one considers chess engines or chatbots to be AI anymore because the goal posts were moved.
I would dispute that. I consider both of those examples to be AI, but not general AI and not particularly strong AI.
Meanwhile I do not consider gradient descent (or biased random walk, or any number of other algorithms) to be AI.
The exact line is fuzzy. I don't feel like most simple image classifiers qualify, whereas style transfer GANs do feel like a very weak form of AI to me. But obviously it's becoming quite subjective at that point.
Yeah, that's moving the goalposts.
Is this really so novel? Engineers have been using evolutionary algorithms to create antennas and other components since the early 2000s at least. I remember watching a FOSDEM presentation on an 'evolved' DSP for radios in the 2010s.
https://en.wikipedia.org/wiki/Evolved_antenna
I don't believe it's comparable. Yes, we've used algorithms to find "weird shapes that work" for a long time, but they've always been very testable. AI is being used for more complex constructs that have exponentially greater testable surface area (like programs and microarch).
Yes, for low-frequency analog circuits these experiments go back to the 1990s at least.
J. R. Koza, F. H Bennett, D. Andre, M. A. Keane, and F. Dunlap, “Automated synthesis of analog electrical circuits by means of genetic programming,” IEEE Trans. Evol. Comput., vol. 1, pp. 109–128, July 1997. https://dl.acm.org/doi/10.1109/4235.687879
This is really interesting and I’m surprised I’ve never even heard of it before.
Now I’m imagining antennas breeding and producing cute little baby antennas that (provided they’re healthy enough) survive to go on to produce more baby antennas with similar characteristics, and so on…
It’s a weird feeling to look at that NASA spacecraft antenna, knowing that it’s the product of an evolutionary process in the genuine, usual sense. It’s the closest we can get to looking at an alien. For now.
Two antennas get married. The wedding was ok but the reception was great!
Well done.
There was also the more relevant (defunct) Distributed Hardware Evolution Project from the University of Sussex, which was using genetic algorithms to evolve circuits: https://wiki.bc-team.org/index.php?title=Distributed_Hardwar...
Yes, it's nothing novel. But it is AI adjacent news, so it automagically becomes a headline.
Except outside of science fiction, it'll just be horribly broken once you put it to use in the real world
To be fair, most of the science fiction is about it being horribly broken or, at least, functioning in ways its human stewards did not intend.
yeah besides the first and last chapters, this is pretty much I, Robot in a nutshell
This may have been explored, but how different is this from natural phenomena whose behavior we have only theories for?
That is, people put aspects of physics and horticulture to use long before understanding the science. Also, with varying success.
Could LLM-generated AI artifacts be thought of in similar lines?
that's basically what the movie is about.
Yul Brynner running around murdering humans.
Actual Article: https://www.nature.com/articles/s41467-024-54178-1#Fig1
This comment (not mine) from the article is absolute Gold:
> "Not only did the chip designs prove more efficient, the AI took a radically different approach — one that a human circuit designer would have been highly unlikely to devise."
> That is simply not true... more likely, a human circuit designer would not be allowed to present a radical new design paradigm to his/her superiors and other lead engineers. (a la Edison, Westinghouse, Tesla, Da Vinci, et-al.)
> AI models have, within hours, created more efficient wireless chips through deep learning, but it is unclear how their 'randomly shaped' designs were produced.
IIRC this was also tried at NASA, they used some "classic" genetic algorithm to create the "perfect" antenna for some applications, and it looked unlike anything previously designed by engineers, but it outperformed the "normal" shapes. Cool to see deep learning applied to chip design as well.
Here's the link to that antenna: https://samim.io/static/upload/Screen_Shot_2018-03-15_at_09....
My theory is that if aliens ever turn up in a ship, assuming it is visible, it will be a butt-ugly super optimized messy asymmetrical shape.
well you don't have to be streamlined in a vacuum
Wasn't there a GA FPGA design to distinguish two tones that was so weird and specific it not only used capacitance for part of its work but literally couldn't work on another chip of the same model?
Yes, indeed, although the exact reference escapes me for the moment.
What I found absolutely amazing when reading about this, is that this is exactly how I always imagined things in nature evolving.
Biology is mostly just messy physics where everything happens at the same time across many levels of time and space, and a complex system that has evolved naturally always seems to contain these super weird, specific, cross-functional hacks that somehow end up working very well towards some goal.
> Yes, indeed, although the exact reference escapes me for the moment.
It's mentioned in a sister comment: https://www.damninteresting.com/on-the-origin-of-circuits/
As I recall it didn’t even work from day to day due to variance in the power supply triggered by variance in the power grid.
They had to redo the experiment on simulated chips.
Yes. The work of Adrian Thompson at the University of Sussex.
https://scholar.google.com/citations?user=5UOUU7MAAAAJ&hl=en
I think it was that or a similar test where it would not even run on another part, just the single part it was evolved on.
I've only started to look into the complexities involved in chip design (for my BitGrid hobby horse project) but I've noticed that in the Nature article, all of the discussion is based on simulation, not an actual chip.
Let's see how well that chip does if made by the fab. (I doubt they'd actually make it, likely there are a thousand design rule checks it would fail)
If you paid them to override the rules and make it anyway, I'd like to see if it turned out to be anything other than a short circuit from power to ground.
They do have some measurement results in figures 6 and 7. Looks like they didn't nail the center frequencies but at mmWave it's reasonable for a first attempt -- they're still missing something in their model though, same as if you did it by hand.
I'm skeptical that these pixelated structures are going to turn out anything better than the canonical shapes. They look cool but may just be "weird EM tricks", deconstructing what doesn't really need to be. Anyone remember the craze for fractal antennas?
If we can't understand the designs, how rigorously can we really test them for correctness?
Our human designs strive to work in many environmental conditions. Many early AI designs, if iterated in the real world, would incorporate local physical conditions into their circuits. For example, that fluorescent lamp or fan I'm picking up (from the AI/evolutionary design algorithm's perspective) has great EM waves that could serve as a reliable clock source, eliminating the need for my own. Move things around, though, and it would break.
I'm sure there are analogous problems in the digital simulation domain. Without thorough oversight and testing across multiple power cycles, it's difficult to predict how well the circuit will function, or how incorporating feedback into the program will steer it; if you're not careful, you end up with the aforementioned strange problems.
Although the article mentions corrections to the designs, what may be truly needed is more constraints. The better we define those constraints, the more likely correctness is to emerge on its own.
> Our human designs strive to work in many environmental conditions. Many early AI designs, if iterated in the real world, would incorporate local physical conditions into their circuits. For example, that fluorescent lamp or fan I'm picking up (from the AI/evolutionary design algorithm's perspective) has great EM waves that could serve as a reliable clock source, eliminating the need for my own. Move things around, though, and it would break.
This problem may have a relatively simple fix: have two FPGAs – from different manufacturing lots, maybe even different models or brands – each in a different physical location, maybe even on different continents. If the AI or evolutionary algorithm has to evolve something that works on both FPGAs, it will naturally avoid purely local stuff which works on one and not the other, and produce a much more general solution.
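A rough sketch of what I mean, assuming a plain genetic algorithm and a hypothetical evaluate_on_board() helper that stands in for programming and measuring one physical FPGA (dummy scoring here, not a real toolchain):

    import random

    def evaluate_on_board(bitstream, board):
        # Placeholder: in reality this would program the bitstream onto the
        # given physical FPGA and measure how well it performs the task.
        # A dummy score is returned so the sketch runs end to end.
        return float(sum(bitstream))

    def fitness(bitstream, boards):
        # Score on every board and keep the *minimum*: a trick that only
        # works on one particular chip scores poorly overall.
        return min(evaluate_on_board(bitstream, b) for b in boards)

    def evolve(boards, pop_size=50, generations=100, genome_len=1800):
        pop = [[random.randint(0, 1) for _ in range(genome_len)]
               for _ in range(pop_size)]
        for _ in range(generations):
            ranked = sorted(pop, key=lambda g: fitness(g, boards), reverse=True)
            parents = ranked[: pop_size // 2]
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = random.sample(parents, 2)
                cut = random.randrange(genome_len)
                child = a[:cut] + b[cut:]
                if random.random() < 0.05:            # occasional mutation
                    child[random.randrange(genome_len)] ^= 1
                children.append(child)
            pop = parents + children
        return max(pop, key=lambda g: fitness(g, boards))

    # e.g. best = evolve(boards=["fpga_lab_A", "fpga_lab_B", "fpga_remote"])

Taking the minimum rather than the average is the important bit: averaging would still let one very good board paper over a failure on another.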
And then you change temperature/elevation/move it next to a router and it falls apart, because after all there is going to be something correlated.
Great, so use ten. Use a hundred. Spread them around. Put one on the ISS.
The problems just have to be uncorrelated.
> Put one on the ISS.
I can see it already: cloud provider offers orbital FPGAs for testing your AI hardware designs
This is similar to why increasing the batch size during LLM training results in better performance: you force the optimizer to generalize to a larger set.
Think about it in the context of LLMs, their internals and the way that we test them; we do our best at the time.
Ask the same "AI" to create a machine readable proof of correctness. Or even better - start from an inefficient but known to be working system, and only let the "AI" apply correctness-preserving transformations.
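A toy illustration of the second idea (my own sketch, not anything from the article): keep a slow but obviously correct implementation as the oracle, and only accept the optimizer's candidate if it agrees with the oracle across a big randomized sweep. Random testing is not a proof, of course; for that you'd want an exhaustive sweep of a small input space or an SMT solver, but the workflow has the same shape.

    import random

    def reference_popcount(x):
        # Slow but obviously correct oracle.
        return bin(x).count("1")

    def optimized_popcount(x):
        # Candidate replacement (pretend the "AI" produced it).
        count = 0
        while x:
            x &= x - 1          # clears the lowest set bit
            count += 1
        return count

    def accept(candidate, oracle, trials=100_000, bits=64):
        # Reject the candidate on the first disagreement with the oracle.
        for _ in range(trials):
            x = random.getrandbits(bits)
            if candidate(x) != oracle(x):
                return False
        return True

    print(accept(optimized_popcount, reference_popcount))   # True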
I don’t think it’s that easy. I’m sure Intel, AMD and Apple have a very sophisticated suite of “known working systems” that they use to test their new chips, and they still build in bugs that security researchers find 5 years later. It’s impossible to test and verify such complex designs fully.
Especially true if the computer design creates a highly coupled device that could be process sensitive.
Results?
Can you always test the entire input space? Only for a few applications.
I am really curious about how you test software...
It's a little different in software. If I'm writing a varint decoder and find that it works for the smallest and largest 65k inputs, it's exceedingly unlikely that I'll have written a bug that somehow affects only some middling number of loop iterations yet somehow handles those already tested transitions between loop iteration counts just fine.
For a system you completely don't understand, especially when the prior work on such systems suggests a propensity for extremely hairy bugs, spot-checking the edge cases doesn't suffice.
And, IMO, bugs are usually much worse the lower down in the stack they appear. A bug in the UI layer of some webapp has an impact and time to fix in proportion to that bug and only that bug. Issues in your database driver are insidious, resulting in an unstable system that's hard to understand and potentially resulting in countless hours fixing or working around that bug (if you ever find it). Bugs in the raw silicon that, e.g., only affect 1 pair of 32-bit inputs (in, say, addition) are even worse. They'll be hit in the real world eventually, and they're not going to be easy to handle, but it's simultaneously not usually practical to sweep a 64-bit input space (certainly not for every chip, if the bug is from analog mistakes in the chip's EM properties).
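For concreteness, here is roughly the kind of edge-focused sweep I mean for the varint example (a sketch assuming LEB128-style encoding, not any particular library):

    def encode_varint(n):
        # LEB128: 7 bits per byte, high bit set on every byte except the last.
        out = bytearray()
        while True:
            b = n & 0x7F
            n >>= 7
            if n:
                out.append(b | 0x80)
            else:
                out.append(b)
                return bytes(out)

    def decode_varint(data):
        result = shift = 0
        for b in data:
            result |= (b & 0x7F) << shift
            if not (b & 0x80):
                return result
            shift += 7
        raise ValueError("truncated varint")

    # Sweep the smallest and largest 65k values of the 64-bit input space,
    # i.e. exactly the edge-focused testing described above.
    LIMIT = 1 << 64
    for n in list(range(65536)) + list(range(LIMIT - 65536, LIMIT)):
        assert decode_varint(encode_varint(n)) == n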
Literally no piece of software is bug-free. Not one. What are you talking about? Of course it's impossible to test all inputs, because there are going to be inputs that you can't even conceive of at the time of designing. What if your application suddenly runs at 1000000x the intended speed because hardware improves so much? How do you test for that?
Hardware doesn’t change over time…
Yes it does. It ages. But even if it didn't, my point still stands. Or are you insinuating that the engineers over at Intel, AMD and Apple don't know what they're doing? Clearly their CPUs aren't flawless and still have bugs, like Spectre/Meltdown.
It deteriorates, it doesn't change. The functionality is still there, and no modern hardware deteriorates to a failing state before it becomes obsolete. Yes, I am insinuating that the engineers at Intel, AMD, Apple and Nvidia are incentivized to prioritize expedient solutions over developing more robust architectures, as evidenced by vulnerabilities like Spectre and Meltdown.
print("No Bugs!")
Depending on the language, this simple code actually has a bug:
https://blog.sunfishcode.online/bugs-in-hello-world/
following classic TDD, use novel "AI" to write millions of test cases.
(forgive me, my fellow HNers...)
Were we ever doing that though?
Evolution seems to work at producing "designs" and there's no understanding there at all.
Pieces like this remind me that even professors need to sell what they do, like saying "Humans cannot really understand them" in this case. Never have we had more simulation tools and compute power than we have today, and yet we can't understand how these chips really work?
I think this is an example of mystifying-for-marketing as used in academia, like portraying this research as some breakthrough at a level that exceeds human understanding. IMHO practitioners of science should be expected to do better than this.
It's not necessarily the professor really saying that. Journalists (and university press offices) like to have such lines in pop-science articles, and how it goes is that there's an interview from which the writer "interprets" some quotes. These are typically sent to the interviewee to check, but many don't bother to push back much if it's not egregiously bad.
I’ve never been able to put it into words, but when we think about engineering in almost any discipline, a significant amount of effort goes into making things buildable by different groups of people. We modularize components or code so that different groups can specialize in isolated segments.
I always imagined if you could have some super mind build an entire complex system, it would find better solutions that got around limitations introduced by the need to make engineering accessible to humans.
An "optimal" solution may do away with "wasteful" abstraction of interfaces and come up with something more efficient. But there is wisdom in narrow interfaces and abstractions. Structure helps to evolve over time which at least for now most computer optimization focuses on getting the best solution now.
I think it's half guess and half hope, but I imagine we'll spend centuries building really dumb mechanisms, then suddenly be completely left in the dust intellectually. I guess that's what you'd call the singularity. I don't know if that hypermind will bother designing circuits for us.
Doesn't need a supermind to prove this is possible. Mere mortals and simple compilers can inline functions and trade abstraction for performance.
I thought tiny wireless antennas were already dark magic that people barely understood anyway, designed more by trial and error. Feels like yet another so-called science publication running a clickbait headline.
There's a great paper that collects a long list of anecdotes about computational evolution.
"The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities"
[1] https://direct.mit.edu/artl/article/26/2/274/93255
As a kid I played a competitive text-based strategy game, and I made my own crude simulation that randomly tried different strategies. I let the simulation run for a few days with billions of iterations, and it came up with a very good gameplay strategy. I went from being ranked below 1000 to top 10 using that strategy.
I also wrote programs that simulated classic game shows like the three doors problem, where you either stay with your door or switch. After running the simulation a million times it ended up with a 66% chance of winning if you switched doors. The teacher of course didn't believe me, as it was too hard a problem for a high schooler to solve, but many years later I got it confirmed by a math professor who proved it.
Computers are so fast that you don't really need AI learning to iterate; just run a simulation randomly and you will eventually end up with something very good.
I think this might be a use case for quantum computers, so if you have a quantum computer I'm interested to work with you.
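For anyone curious, the three-doors simulation is only a few lines; this is a hypothetical reconstruction, not the original program:

    import random

    def monty_hall(trials=1_000_000):
        wins_stay = wins_switch = 0
        for _ in range(trials):
            car = random.randrange(3)     # door hiding the car
            pick = random.randrange(3)    # contestant's first pick
            # Host opens a door that is neither the pick nor the car.
            opened = next(d for d in range(3) if d != pick and d != car)
            switched = next(d for d in range(3) if d != pick and d != opened)
            wins_stay += (pick == car)
            wins_switch += (switched == car)
        print("stay  :", wins_stay / trials)      # ~0.333
        print("switch:", wins_switch / trials)    # ~0.667

    monty_hall()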
In general I also find middle school and high school math teachers woefully ignorant about Monte Carlo methods.
I think it's pure AI hype to claim these are beyond human understanding, and I doubt that's what the professor really meant. There's real physical processes going on, and we can study them carefully to eventually learn how they work. We just don't understand them yet.
It's religion that claims reality is beyond human understanding; that's not something scientists should be doing.
It's inevitable: software (and other systems) will also become like this.
One of the junior developers I worked with years ago wrote code that humans couldn't understand; maybe he was just ahead of his time.
Some software I inherited from my predecessor is already like this.
When I got it, one part of it was a single Perl file with about 5k lines of code, with 20+ variables visible across the whole file, 10+ levels of nested loops, and basically all of them with seemingly random "next LABEL" and "last LABEL" statements, which are basically slightly-constrained GOTOs. Oh, and the variable names were mostly meaningless to me (one or two letters).
This was only a small part of my job, over the years I've managed to reduce this mess, broke out some parts into smaller functions, reduced the scope of some variables etc. but a core remains that I still don't really understand. There's some mental model deep in the original programmer's mind that I simply cannot seem to grasp and that the code structure is based on.
(We're now replacing this whole thing with a cleaner re-implementation, with unit tests, a less idiosyncratic structure, and more maintainers.)
Now imagine what it must feel like if the original programmer wasn't human, but some alien mind that we're even further from understanding.
I've been using Cursor, it already is. I've found myself becoming merely a tester of the software rather than a writer of it, the more I use this IDE.
It's a bit clunky still, IMHO. Or did you find a good tutorial to leverage it fully?
It’s really been advertised heavily lately but I just discovered it a couple weeks ago, and in case you’re unaware the real aha moment with Cursor for me was Composer in Agent mode with Sonnet 3.5.
If you want the highest chance of success, use a reasoning model (o3-mini high, o1 pro, r1, grok 3 thinking mode) to create a detailed outline of how to implement the feature you want, then copy paste that into composer.
It one shots a lot of greenfield stuff.
If you get stuck in a loop on an issue, this prompt I got from twitter tends to work quite well to get you unstuck: "Reflect on 5-7 different possible sources of the problem, distill those down to 1-2 most likely sources, and then add logs to validate your assumptions before we move onto implementing the actual code fix."
Just doing the above gets me through 95% of stuff I try, and then occasionally hopping back out to a reasoning model with the current state of the code, errors, and logs gets me through the last 5%.
And then it's pretty much game over.
It's better that we [democracies] ride and control the AI paradigm shift than just let someone else do it for us.
"Democracy" is just a chant now. It's supposed to somehow happen without votes, privacy, freedom of expression, or freedom of association.
Well, Democracy is still the least worst of all political systems!
But please: would you prefer something else?
The point is there is no difference except spelling.
Not game over: it’s just that Engineering will turn into Biology :D
Psychotherapy, rather, as the natural evolution of prompting.
That's an approach.
Just last night I took a similar approach to arriving at a number of paths to take: I shared my desired output with a knowledge graph I had populated and asked the AI to fill in the blanks about the activities that would lead a user to that output. It worked! A few non-correlative gaps came up as well, and after some fine-tuning they were included in the graph to enrich the output.
I feel this is a similar approach, and it's our job to populate and understand the gaps in between if we are trying to understand how these relationships came to exist. A visual mind map of the nodes and the entire network is a big help for a visual learner like myself to see the context of LLMs better.
Anyway, the tool I used is InfraNodus, and I'm curious if this community is aware of it; I may have even discovered it on HN, actually.
I've seen junior code "so weird that humans cannot understand them".
Maybe we’re all just in someone’s evolutionary chip designer
> The AI also considers each chip as a single artifact, rather than a collection of existing elements that need to be combined. This means that established chip design templates, the ones that no one understands but probably hide inefficiencies, are cast aside.
there should be a word for this process of making components efficiently work together, like 'optimization' for example
This is a strange distinction for the article to point out. If you want to take a more modular approach all you have to do is modify the loss function to account for that. It's entirely arbitrary.
And the fact that humans "cannot understand it" means that it's likely overfitted to the job. If you want to make slight modifications to the design, you'll likely have to run the AI tool over again and get a completely new design, because there's zero modularity.
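To make that concrete (purely illustrative; the two helpers below are assumed placeholders, not anything from the paper): folding a coupling penalty into the objective is all it would take to push the optimizer toward modular designs.

    def performance(design):
        # Placeholder: e.g. deviation from the target electromagnetic response.
        return 0.0

    def coupling(design):
        # Placeholder: e.g. measured cross-talk between the sub-blocks you
        # want to keep separable and independently editable.
        return 0.0

    def loss(design, weight=0.1):
        # The optimizer now trades raw performance against modularity explicitly.
        return performance(design) + weight * coupling(design)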
I wonder about the security of chips designed this way. We've already seen that an apparently optimal architecture can lead to huge errors that create security flaws (Spectre, PACMAN on the M1, etc.).
Also see the work done on topology optimization. Mechanical designs no human would come up with, but no AI required either, just numerical optimization.
All the way at the bottom, after all the amazing claims: "many of the designs produced by the algorithm did not work."
They make no mention of the kind of algorithm/model they used. I believe it was not an LLM, was it?
"Although the findings suggest that the design of such complex chips could be handed over to AI, Sengputa was keen to point out that pitfalls remain “that still require human designers to correct.” In particular, many of the designs produced by the algorithm did not work– equivalent to the "hallucinations" produced by current generative AI tools."
:-|
I'm sure AI produced code will be unintelligible to humans soon too.
When are we going to see these in production and actually used?
"In particular, many of the designs produced by the algorithm did not work"
When I see something I don't understand I use AI to help me understand it.
Hey, some of us didn't understand regular chips anyway.
That's kind of stuff that really makes me excited about AI.
tool assisted speedrun produces unreadable spaghetti code
ship it
No wonder YC was looking for startups working in this field.
AI-designed electronics and software will be a security nightmare, at least in the beginning.
Judging by the code it outputs, we need to, because I have to constantly fix most of the code LLMs output.
Vast chunks of engineering are going to be devalued in the next 10-15 years, across all disciplines. It's already enabling enormous productivity gains in software, and there's zero reason this can't translate to other areas. I don't see any barrier to transformers being able to write code-cad for a crankshaft or a compressor, for example, other than the fact that so far they haven't been trained to do so. Given the extent to which every industry uses software for design, there's nothing to really stop the creation of wrappers and the automation of those tasks. In fact, proprietary kernels aren't even a barrier, because the gains in productivity make building a competitor easier than ever before.
I certainly disagree that it's enabling enormous productivity gains in software. It's a productivity loss to have a tool whose output you have to check yourself every time (because you can't trust it to work reliably).
When I was studying, I implemented a flight dynamics simulation from scratch, partly as a learning exercise, and partly so that I could have greater control over the experiments I wanted to run. The trickiest part of this was the rotations between the local and inertial frames, which took the better part of a week for me to figure out (especially the time derivative of the quaternion).
On a lark, I asked Deep Seek to implement the relevant functions yesterday, and it spat them out. Not only were they correct, they came with a very good low level description of what the code was doing, and why -- i.e. all of the stuff my head was against the desk for while I was figuring it out.
If I wanted to implement, say, an EKF tomorrow, I have zero doubts that I could do it on my own if I had to, but I'm also 99% sure Deep Seek could just spit it out and I'd only have to check it and test it. It's not a substitute for understanding, and knowing the right questions to ask, but it is tremendously powerful. For the stuff I'm usually doing, which is typically mathematically demanding, and for which implementation can often be harder than checking an existing implementation is correct, it's a tremendous productivity gain.
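For reference, the part that took me a week boils down to one equation, dq/dt = 0.5 * q ⊗ (0, ω), with ω the body-frame angular rate. A minimal sketch in the scalar-first (w, x, y, z) convention, my own rather than anything Deep Seek produced:

    import numpy as np

    def quat_mul(p, q):
        # Hamilton product, scalar-first: (w, x, y, z).
        pw, px, py, pz = p
        qw, qx, qy, qz = q
        return np.array([
            pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw,
        ])

    def quat_derivative(q, omega_body):
        # dq/dt = 0.5 * q ⊗ (0, ω), with ω in rad/s in the body frame.
        return 0.5 * quat_mul(q, np.concatenate(([0.0], omega_body)))

    def step(q, omega_body, dt):
        # Simple Euler propagation; renormalize to fight numerical drift.
        q = q + quat_derivative(q, omega_body) * dt
        return q / np.linalg.norm(q)

    # e.g. one step of a 1 rad/s roll starting from the identity attitude:
    q = step(np.array([1.0, 0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), 0.01)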
>that pitfalls remain “that still require human designers to correct.” In particular, many of the designs produced by the algorithm did not work
So? Nothing.
I mean, none of the complex operations-research optimal solutions are graspable by the human brain. Look at a complex travelling salesman solution with delivery time windows and your head will spin; you will be wondering how that solution can be optimal. But then you try your rational heuristic and it sucks compared to the real optimum.
Resistance is futile
But does it work correctly? That's the big problems with AI hallucinations. They're weird and don't work correctly.
Our vision for a Utopian state run by autonomous AIs included building a web browser. So we did. It barely works.
Today's write-up: https://medium.com/@rviragh/our-new-ai-generated-browser-bar...
Didn't realize I have so much in common with AI designed chips.
The same comments were made about John Koza's results with Genetic Programming. However, there are some obvious differences between the current model based techniques and Genetic Algorithms. Some feel that the path to AGI will necessarily include a GA component.
https://www.genetic-programming.com/jkpdf/gecco2000antenna.p...
now we are talking! next level for sure.