• 2 Posts
  • 81 Comments
Joined 1 year ago
Cake day: June 12th, 2023





  • If you read the article, it explains why the fact that this was an ordinary failure is a bad thing. Ordinary failures (like someone not installing some bolts) are not supposed to happen in high-reliability systems like passenger aircraft. Failures in such systems tend to be “extraordinary”: multiple factors line up in an overlooked way to create an unexpected failure mode.

    A 10 year old could tell you that not installing safety bolts where they are supposed to be makes things dangerous. The fact that that is how a potentially lethal failure happened is damning.







  • They have no fact repositories to rely on.

    They do not possess the ability to know what is and is not correct.

    They cannot check documentation or verify that a function or library or API endpoint exists, even though they will confidently create calls to them.

    These three are all just the same as asking a person about them: they might know or they might not, but they can't check right there and then. Yes, LLMs by their nature cannot access a region marked “C# methods” or whatever, but large models do have some of that information embedded in them; if they didn't, they wouldn't get correct answers anywhere near as often as they do, which for large models and common languages/frameworks is most of the time. This is before getting into retrieval augmented generation, where they do have access to repositories of fact (a minimal sketch of that idea follows this comment).

    This is what I was complaining about in the original post I replied to: nowhere have I, or anyone else I've seen in this thread, said you should rely on these models, just that they are a useful input. Yet relying on them and using them without verification is the position you and the other poster are arguing against.
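    A minimal sketch of the retrieval augmented generation idea mentioned above, assuming a toy TF-IDF retriever over a few documentation strings; `call_llm` is a hypothetical stand-in for whatever completion API you actually use, not a real library function:

```python
# Sketch of retrieval augmented generation (RAG): fetch relevant reference
# text first, then prepend it to the prompt so the answer is grounded in a
# "repository of fact" rather than only in the model's weights.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "list.append(x) adds x to the end of a Python list.",
    "dict.get(key, default) returns default when key is missing.",
    "str.split(sep) splits a string on the given separator.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (TF-IDF cosine)."""
    vec = TfidfVectorizer().fit(docs + [query])
    scores = cosine_similarity(vec.transform([query]), vec.transform(docs))[0]
    ranked = sorted(zip(scores, docs), reverse=True)
    return [doc for _, doc in ranked[:k]]

def call_llm(prompt: str) -> str:
    # Placeholder: swap in your actual model/API call here.
    return f"[model response to]\n{prompt}"

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Use only this documentation:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(answer("How do I add an item to a list?"))
```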


  • They can be useful for exploration and learning, sure. But lots of people are literally just copy-pasting code from LLMs - they just do it via an “accept Copilot suggestion” button instead of an actual copy-paste.

    Sure, people use all sorts of tools badly; that's a problem with the user, not the tool (generally - I would accept poor tool design can be a factor).

    I really dislike the statement “LLMs don't know anything, they are just statistical models”. It's such a thought-terminating cliche, and it is either vacuous or wrong depending on which way you mean it. If you mean they have no information content, that's just factually wrong: clearly they do. If you mean they don't understand concepts the way a person does, well, yes, but neither does Google search, and we have no problem using that as the starting point for finding things out. If you mean they can get answers wrong, it's not like people are infallible either (and I assume you agree people do know things).





  • In a Ted Talk in April, Li further explained the field of research her startup will work on advancing, which involves algorithms capable of realistically extrapolating images and text into three-dimensional environments and acting on those predictions, using a concept known as “spatial intelligence.” This could bolster work in various fields such as robotics, augmented reality, virtual reality, and computer vision. If these capabilities continue to advance in the ambitious ways Li plans, it has the potential to transform industries like healthcare and manufacturing.

    I mean, that sounds a lot more interesting than 99% of the LLM work going on at the moment, and given that she led the team that cracked the computer-vision problem of recognising objects, she has pedigree.



  • To put a bit of context on those figures: 50 GWh is a single medium-sized power station running for two days, to create something that is being used around 10 million times a day all over the world.

    At 10 million queries per day, that puts the usage per query at 100-500 Wh: about the amount of energy used by leaving an old incandescent lightbulb on for an hour, or playing a demanding video game for about 20 minutes.

    As another comparison, in the USA alone around 12,000 GWh of energy is spent burning gasoline in vehicles every single day. So Americans driving 1% less for a single day would save more energy than creating GPT4 and the world using it for a year (rough numbers below).
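    A rough back-of-envelope check of those comparisons, taking the figures quoted above at face value; the bulb and gaming wattages are my own assumed values, not from the article:

```python
# Unit-conversion check of the comparisons above. Input figures come from
# the comment; the 100 W bulb and 300 W gaming draw are assumptions.
creation_gwh = 50                    # quoted energy to create the model
per_query_wh = 100                   # lower end of the quoted 100-500 Wh range
bulb_w = 100                         # old incandescent bulb (assumed)
gaming_w = 300                       # demanding video game session (assumed)

print(f"{per_query_wh} Wh ~ bulb on for {per_query_wh / bulb_w:.1f} h, "
      f"or gaming for {60 * per_query_wh / gaming_w:.0f} min")

us_gasoline_gwh_per_day = 12_000     # quoted US daily gasoline energy
saving_gwh = 0.01 * us_gasoline_gwh_per_day
print(f"Driving 1% less for one day saves {saving_gwh:.0f} GWh, "
      f"vs {creation_gwh} GWh to create the model")
```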


  • I was responding to your general statement that Python is slow and so there is no point in making it faster. I agree that removing the GIL won't do much to improve the execution speed of programs making heavy use of numpy or of things calling outside it.

    That’s a bit suss too tbh. Did the C++ version use an existing library like Eigen too or did they implement everything from scratch?

    It was written entirely from scratch, which is kind of my point: a well-written Python program can outperform a naive C implementation and is vastly simpler to create.

    If you have the expertise and are willing to put in the effort, you likely can squeeze that extra bit of performance out by dropping to a lower-level language. But for certain workloads you can get good performance out of Python if you know what you are doing, so calling it extremely slow and saying you have to move to another language if you care about performance is misleading. A quick illustration of the loop-versus-vectorised difference follows below.
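    A minimal sketch of the kind of difference I mean, comparing a plain Python loop with the equivalent vectorised numpy expression; the array size and operation are arbitrary choices for the demo, not from the project discussed here:

```python
# Compare a plain Python loop with the equivalent vectorised NumPy
# expression; the latter runs as a single call into compiled code.
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

start = time.perf_counter()
out_loop = [a[i] * b[i] + 1.0 for i in range(n)]   # interpreted, element by element
loop_s = time.perf_counter() - start

start = time.perf_counter()
out_vec = a * b + 1.0                              # one vectorised operation
vec_s = time.perf_counter() - start

print(f"loop: {loop_s:.3f}s, vectorised: {vec_s:.4f}s, speedup ~{loop_s / vec_s:.0f}x")
```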


  • Numpy is written in C.

    Python is written in C too - what's your point? I've seen this argument a few times and I find it bizarre that “easily able to incorporate highly optimised Fortran and C numerical routines” is somehow portrayed as a point against Python.

    Numpy is a de facto extension to the Python standard that adds first-class support for single-type multi-dimensional arrays and functions for working on them. It is implemented in a mixture of Python and C (about 60% Python according to GitHub), interfaces with Python's C API, and links in specialist libraries for its operations. You could write the same statement about parts of the Python standard library - is that also not Python?

    It's hard to overstate just how much simpler development is in numpy compared to C++: in this example the new Python version was less than 50 lines and was developed in an afternoon, while the C++ version was closing in on 1,000 lines over six files.


  • Nope. If you're working on large arrays of data you can get significant speed-ups using well-optimised, vectorised BLAS functions (via numpy), which beat simply written C++ that operates on each array element in turn. There's also Numba, which uses LLVM to JIT-compile a subset of Python to get compiled performance, though I didn't go to that in this case.

    You could link the BLAS libraries into C++, but it's significantly more work than just importing numpy from Python. A sketch of both options follows below.
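    A hedged sketch of the two routes mentioned, assuming Numba is installed and using arbitrary matrix sizes; neither snippet is from the project being discussed:

```python
# Two ways to get compiled-speed numerics from Python: NumPy operations
# that dispatch to optimised BLAS, and a Numba-compiled explicit loop.
import numpy as np
from numba import njit

a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)

c = a @ b              # matrix multiply dispatched to the BLAS gemm routine

@njit                  # compiled to machine code via LLVM on first call
def row_sums(m):
    out = np.zeros(m.shape[0])
    for i in range(m.shape[0]):
        for j in range(m.shape[1]):
            out[i] += m[i, j]
    return out

print(c.shape, row_sums(a)[:3])
```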