The Current State of AI: The Best AI Tools on the Market
The Future State of AI: Biggest AI Trends to Watch Out For
It’s getting harder to keep up with artificial intelligence these days; it often seems like there are new developments happening every hour. AI may be in a constant state of flux, but one thing is certain: it’s only getting bigger and more commonplace. The AI industry is even expected to grow to $2 trillion by 2030, promising a technological transformation across every market and in our own lives.
To truly understand AI, you may want to brush up on its origins with our History of Artificial Intelligence blog. But as AI matures, it raises questions about its — and our — future, so we’re taking the time to examine where it is now and where we think it’s headed.
This blog features insights from David Hudson, our director of engineering and resident AI aficionado. His artificial intelligence work includes an AI-generated blog and an upcoming book about prompt engineering, both done in partnership with tech entrepreneur William Hurley. He also piloted the development of Unhuman, our collection of AI products, and Literally Anything, our text-to-web-app tool.
Read on to learn more about the latest in artificial intelligence news and trends.
“AI is exploding because we can ask it human questions now.”
- David Hudson, Big Human Director of Engineering
Scientists, engineers, and researchers have explored artificial intelligence for decades. However, the undeniable AI boom we’re experiencing today can be directly traced back to ChatGPT, a natural language processing tool OpenAI released as a free research preview in November 2022. As the most sophisticated chatbot to date, ChatGPT elevated AI in a way that brings us just a little bit closer to the anthropomorphic tech we’ve seen in movies and TV shows. “AI is exploding because we can ask it human questions now,” David says. “It’s able to do more human things than before, which makes AI much more accessible.”
It’s important to remember, though, that ChatGPT isn’t sentient (yet). “It doesn’t really understand anything,” David notes. “It just figures out what the patterns are and gives you the next sequence of patterns.” After its launch, ChatGPT ignited a marathon AI race that both smaller companies and industry giants like Google and Microsoft are scrambling to win, anxiously hoping they’ll develop the next big thing. This race has brought in an influx of new AI technologies and, as David points out, “There are actually more AI tools than we can handle; we can't keep track of it fast enough.”
To get you up to speed on what’s happening now in artificial intelligence, here’s a rundown of the biggest — and best — AI releases in the last few years.
A large language model (LLM) is a type of artificial intelligence trained on text data (articles, books, and various internet-based resources) to generate human-like responses to natural language prompts. LLMs rely on deep learning, a machine learning technique that uses layered neural networks to recognize intricate patterns in pictures, text, sounds, and other data.
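David’s earlier point that a model “just figures out what the patterns are and gives you the next sequence” can be made concrete with a toy example. Real LLMs learn billions of neural network parameters; the sketch below is only an illustration of next-word prediction using simple word-pair counts, and every name in it is ours, not any library’s:

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequently seen follower of `word`, if any."""
    candidates = follows.get(word.lower())
    return candidates.most_common(1)[0][0] if candidates else None

corpus = (
    "the cat sat on the mat "
    "the cat chased the mouse "
    "the mouse ran under the mat"
)
model = train_bigrams(corpus)
print(predict_next(model, "chased"))  # prints "the"
```

An LLM does something conceptually similar at enormous scale: instead of counting word pairs, it learns statistical patterns over whole contexts, which is why its answers feel human without the model understanding anything.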
ChatGPT: As the chatbot that stirred the current AI frenzy, ChatGPT is built on the third generation of OpenAI’s GPT models. In March of 2023, the AI research lab released GPT-4, a more advanced system with wider problem-solving capabilities.
Bard: Google launched the experimental AI chat service Bard in March of this year. Dubbed a “creative and helpful collaborator,” Bard functions similarly to ChatGPT, but, instead of pulling from a fixed training dataset, it can scour the web in real time for the most up-to-date answers and research.
Bing: Microsoft has its own chatbot, Bing Chat, but its more notable artificial intelligence update was to its Bing search engine in February. As one of the first major search engines to integrate AI, Bing reviews results from across the internet to deliver summarized digests with the option to view more thorough answers.
Text-to-image models rely on prompt engineering, the practice of crafting and refining the input text a model receives. This allows users to give a text description of what they’d like to see, with models then producing images that match the description. Prompts can be simple (“a dog”) or complex (“a dog frolicking in a field of wildflowers while wearing a party hat”). Some models even offer more creative freedom with negative prompts, a feature that lets users stipulate what they don’t want to see. Though the images it produces are original, AI-generated art is an emerging genre that has sparked debates about human creativity and ethical AI applications.
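To see how a prompt and a negative prompt fit together, here’s a minimal sketch of assembling the two strings before they’d be handed to a model. The function and parameter names are hypothetical, not any specific tool’s API:

```python
def build_prompt(subject, details=(), negatives=()):
    """Compose a text-to-image prompt and its matching negative prompt."""
    prompt = ", ".join([subject, *details])
    negative_prompt = ", ".join(negatives)
    return prompt, negative_prompt

prompt, negative = build_prompt(
    "a dog frolicking in a field of wildflowers",
    details=["wearing a party hat", "golden hour lighting"],
    negatives=["blurry", "extra limbs", "text"],
)
print(prompt)    # the full description the model should match
print(negative)  # everything the model should avoid
```

Tools that support negative prompts typically accept the two strings as separate inputs, steering generation toward the first and away from the second.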
DALL-E: Expanding on its work with ChatGPT, OpenAI rolled out DALL-E in early 2021 and then DALL-E 2 in April of 2022. DALL-E can quickly create original images and art in an assortment of styles, including still-life photography, Surrealism, Art Deco, and more.
Midjourney: The self-funded research lab Midjourney released its eponymous generative AI program last July. When compared to DALL-E, Midjourney is known for its ability to produce hyper-realistic, dream-like renderings.
Stable Diffusion: Launched in August of 2022 by Stability AI, an open-source generative artificial intelligence company, Stable Diffusion can be accessed through web browsers or installed on your own computer, allowing users to train it with their own data. Unlike DALL-E and Midjourney, Stable Diffusion’s source code is public.
As the inverse of text-to-image, image-to-text is an artificial intelligence model that can scan existing images and convert them into readable text descriptions. Since it produces detailed image captions, image-to-text is lauded as an impressive accessibility feature, enabling those who are visually impaired to experience AI-generated content.
Google Lens: Google was an early pioneer in image-to-text, releasing Google Lens in October of 2017. After users upload an image to Google Lens, the visual search tool identifies objects within the image and then populates relevant information about those objects. Along with recognizing plants, animals, and word problems, Google Lens can also copy and translate text into over 100 languages. Google Lens’ precursor was Google Cloud Vision, an application programming interface (API) with a range of vision detection features like image labeling and facial recognition.
Midjourney: In April of this year, Midjourney added a “describe” command to its platform, allowing users to generate text prompts based on any visual. While they give users the ability to learn new vocabulary and artistic movements, the prompts can’t precisely recreate an already existing image.
ChatGPT: Though it isn’t an image generation system, GPT-4 can now understand and produce text responses to image inputs (before this update, users had to integrate image-to-text plugins into ChatGPT’s API). The new mixed-format input feature can scan images and provide outputs that summarize photos, graphs, data sets, reports, and more.
Text-to-speech processes text and transforms it into an audio file with human-like speech. This AI capability can be used in assistive technology, voice assistants, audiobooks, and more, virtually eliminating the need to collect real human voice samples. Text-to-speech conversion has stretched past voice generation and into voice cloning, a trend that’s received both positive and negative attention. You may have seen or heard of a few deepfakes, like fans mimicking Taylor Swift’s voice to send themselves personalized messages or scammers pretending to be a loved one in need.
ElevenLabs: According to David, ElevenLabs was the “first good text-to-speech” and voice cloning software. Released in January of this year, the AI model promises “top quality spoken audio in any voice, style, and language”; it can even synthesize human intonations and inflections.
Voicebox: Just this month, Meta unveiled Voicebox. The versatile generative AI model can edit, sample, and style audio files. Along with support for six languages, Voicebox can be trained to do other tasks through in-context learning.
Another variation of speech recognition software, speech-to-text can identify the spoken word and translate it into a written format through linguistic algorithms and machine learning. There are a few artificial intelligence models that can recognize and transcribe speech in real time, screening auditory signals and turning them into text using Unicode, a universal encoding standard that assigns unique numbers to every character across a variety of languages. Speech-to-text has been around for as long as we’ve had voice assistants; every time you say “Hey, Google” or “Hey, Alexa,” you’re using speech-to-text.
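The Unicode step described above is easy to see in practice: every character in every script maps to a single, unique code point, so a transcription engine can emit text in any language through one encoding standard. A small illustrative sketch (the function name is ours):

```python
def to_code_points(text):
    """Map each character of a transcript to its Unicode code point."""
    return [(ch, f"U+{ord(ch):04X}") for ch in text]

# The same function handles any script with no per-language logic.
print(to_code_points("Hi"))   # [('H', 'U+0048'), ('i', 'U+0069')]
print(to_code_points("こん"))  # [('こ', 'U+3053'), ('ん', 'U+3093')]
```

Because Latin letters and Japanese kana flow through the identical code path, speech-to-text systems don’t need separate text pipelines per language.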
Whisper: In March of 2023, OpenAI launched the Whisper API, hosted access to its open-source speech recognition neural network, which can transcribe several languages from audio files in almost every common format (M4A, MP3, MP4, MPGA, WAV, and WEBM). Whisper was trained on 680,000 hours of multilingual data from across the internet, which helps it recognize accents, technical terminology, and background noise.
Nova: Deepgram, an AI transcription company dedicated to understanding human language, released Nova in April, calling it the “world’s most powerful speech-to-text model.” Its strength lies in its ability to deliver budget-friendly, quick, and accurate transcriptions.
With text-to-video, artificial intelligence models synthesize and translate information from written materials into a video format in a way that’s quicker and more efficient than traditional video production. It’s one of the newer AI capabilities, with some users stringing together their own content in apps that weren’t built for text-to-video, like Midjourney and Stable Diffusion. “There are a lot of text-to-video models out there, but there aren’t very many that do it well,” David says.
Runway ML: Applied artificial intelligence research company Runway released a beta version of Runway ML back in 2019, and it’s still one of the best text-to-video models on the market today. Runway ML’s Gen-2 launched in March of this year, boasting more realistic video generation in any artistic style.
Make-A-Video: Last September, Meta elaborated on its own generative technology research with Make-A-Video, an artificial intelligence system that uses text prompts and publicly available datasets to create high-quality videos filled with dynamic colors, characters, and landscapes. Make-A-Video can produce original videos from images or other existing videos.
“The only way artificial intelligence can safely move forward is to open-source LLMs.”
- David Hudson, Big Human Director of Engineering
Artificial intelligence has always been finicky; its history is a rollercoaster filled with ups and downs. With new developments happening almost daily, AI’s future may not be something we can predict with 100% accuracy. However, there are a few trends and shifts in the industry — both good and bad — we’re keeping an eye on right now.
There’s a lot of talk about whether or not artificial intelligence will result in job loss; the truth is — and there’s no easy way to say this — it already has. Just last month, around 3,900 people were laid off, marking the first time AI was cited as the reason. All of these layoffs came from the tech industry, with several companies replacing customer support employees with their own chatbots. As artificial intelligence changes the way companies operate, it will, of course, change the jobs companies will need. It’s estimated that AI is capable of doing a quarter of all the work currently done by humans, foreshadowing even more layoffs down the road.
But there isn’t cause for mass panic — yet. Artificial intelligence isn’t human, so it’s incapable of taking on responsibilities that need a human touch. There are still quite a few jobs that require human emotional intelligence, thinking, and problem-solving skills; workers in the creative, medical, and trade fields are likely to remain the safest. In the coming years, though, AI will inevitably seep into parts of every job, transforming the types of tasks we normally do. David likens this to internship roles where the tasks are lower stakes: “AI is going to go from a really motivated intern to a really motivated entry-level person.”
As artificial intelligence grows, the way we use it will, too. It’s already integrated into most of the devices we use on a daily basis, so human interaction with artificial intelligence will only become more prevalent. And since we’re already asking chatbots a million questions, their ability to source information from every corner of the internet will give us something of a pseudo-search engine. “It’s getting better at outputting facts, so people are going to think about it the way they think about Google,” David predicts.
The open-source movement advocates for the free, widespread use of artificial intelligence technologies, believing AI should be accessible to everyone. Both public and private companies are ramping up their AI efforts, but with so many players in the game, open-source might be the best path in ensuring AI can be responsibly used and relied upon. “The only way artificial intelligence can safely move forward is to open-source LLMs,” David advises. “Open-source is the way to go because people aren’t going to trust a single entity.”
While this era of expedited artificial intelligence innovation is taking over the world, it’s also causing concerns about security and potentially harmful uses. Back in 2015, Stephen Hawking, Elon Musk, Steve Wozniak, and more than 3,000 others famously signed an open letter to governments across the globe, calling for a ban on artificially intelligent weapons and AI warfare. Since AI can’t distinguish between fact and fiction when it pulls information from the internet (yet), there are also concerns about propaganda and misinformation.
For most of its history, artificial intelligence has been an unregulated industry, so the world’s governments are clamoring for accountability and safety policies. Proponents of this even include Sam Altman, the CEO of OpenAI. However, David doesn’t believe total AI regulation is feasible — at least not for privately held companies or open-source organizations. “No AI can be a neutral source; it’s not possible,” David says. “But I think people will use AI to fight that as well.”