AI will automate the jobs you probably wouldn’t have bothered to do!

One of my friends is really into using LLMs to automate office processes. He has a few different accounts, including a business ChatGPT account which he’s making heavy use of. Unlike most of the tech bros who constantly talk rubbish about AI, he’s really enthusiastic about it because he has real world uses that are making a big difference for his company. I can’t go into specifics, as it’s confidential, but I do want to speak in general terms about it.

Data Cleansing

One of the processes he’s excited about is essentially data cleansing. His company has a lot of publicly available data about products, but the quality and quantity of that data varies considerably. What’s more, the data needs additional categorization.

He’s never considered himself a “developer”, as he’s more of a management type, but he’s actually done quite a bit of development over the years. With the assistance of ChatGPT he’s written some Python to submit source data to ChatGPT, get it to clean up the data, fill in some of the missing bits, and categorize it. The cleansed data is dumped out to files, where it can be checked and loaded back into the system.

He’s had this data cleansing process running overnight for a few weeks and it’s chugging through his data and doing a great job so far. Because of the nature of the task, he doesn’t have to deal with hallucinations or security issues, which is nice. His estimate is it would have taken about 10 years for his current folks to do this manually.
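The loop he described can be sketched roughly like this. To be clear, this is my own illustrative sketch, not his actual code. The prompt wording, record format and file layout are all invented, and the model call is injected as a plain callable so you can see the shape of the process without an API key. In the real job the callable would wrap something like the OpenAI Python client.

```python
# Hypothetical sketch of an overnight data cleansing loop: submit each
# record to an LLM, parse the cleaned JSON reply, and dump the results
# to a file for manual checking before loading back into the system.
import json
from pathlib import Path

PROMPT_TEMPLATE = (
    "Clean up this product record: fix obvious typos, fill in missing "
    "fields where the existing data implies them, and add a 'category' "
    "field. Reply with JSON only.\n\n{record}"
)

def build_prompt(record: dict) -> str:
    """Render one source record into the cleansing prompt."""
    return PROMPT_TEMPLATE.format(record=json.dumps(record))

def cleanse(records, ask):
    """Run each record through `ask` (a callable taking a prompt string
    and returning the model's text reply) and parse the JSON replies."""
    cleaned = []
    for record in records:
        reply = ask(build_prompt(record))
        cleaned.append(json.loads(reply))
    return cleaned

def dump_for_review(cleaned, path):
    """Write the cleansed records out for a human to check."""
    Path(path).write_text(json.dumps(cleaned, indent=2))
```

Injecting `ask` rather than hard-coding an API client keeps the loop testable, and it matters here because the human check on the output files is what makes the whole process safe.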

There are other tasks he’s already automated, or is in the process of automating, but I want to focus on this one.

But would you have done it manually?

We often hear about AI taking people’s jobs, but I’m not always convinced that’s true, and here’s why.

When we were talking I asked if he would have done this data cleansing manually if he didn’t have access to ChatGPT (or some other LLM) to help him. His answer was probably not. They couldn’t really commit to a 10 year project, or hire 10 times the people to make it a 1 year project, so they probably would have just made do with crappy data.

This reminds me of another automation story

This reminds me of another story. I was chatting to someone who was waxing lyrical about some automation, and how it had saved so much time for one of his teams. What he didn’t know is I had been part of that automation process, and I explained to him that the team in question never used to bother to perform this task in the past. So technically it wasn’t saving them any time. The automation had allowed them to do something that they should have always been doing, but never bothered to before. πŸ™‚

But what about the current job cuts?

I’m not saying no jobs will be lost because of AI, the same way I’m not saying no jobs will be lost because of automation. What I am convinced of is some work will end up being done by AI and automation that would never have got done without it.

You can argue this stopped new people being employed to do the work, but I suspect those jobs would never have happened in the first place.

I rarely see people saying their company needs to do less work. What they want to do is to get more work done without hiring more people. I think that is a very different scenario.

But there are massive job cuts going on in the tech industry because of AI right? Wrong. The big players invariably say they are making these cuts because of AI, but that’s a pile of crap. In many cases they are getting rid of the people they over-hired during the pandemic or thinning out roles they “think” are no longer needed.

So why say it’s because of AI? If you shed 10,000 staff that sounds like either your business is screwed, or you were hiring stupid numbers of people for no reason in the past. Neither of these options sounds good to investors. If you lay off 10,000 people and say you were able to streamline because of AI, you sound super efficient and cool. Investors love that crap.

Conclusion

I follow the AI space, and a big chunk of what is said is total rubbish. The press loves exaggerated claims and drama because it gets attention. As a result you can easily start to think the whole thing is rubbish, but let’s not throw the baby out with the bath water.

I can see uses for it, and coming back to the title of the post, I think some of them will be for automating tasks you wouldn’t otherwise have the time or money to do.

Cheers

Tim…

PS. I understand I’m tossing around the term AI, when I’m not really talking about AI. Chill! πŸ™‚

PPS. I was chatting to my friend again last night and asked how the release of ChatGPT 5 had affected his workflows. The response was, “a lot”, and not in a good way. Not having control over when these things drop is a problem.

Update: A friend messaged me, and I thought it was worth sharing as they are very important questions. See his questions, and my responses below. πŸ™‚

Q: Has he breached Data Protection law by sharing the data with the LLM?
A: No. All the data is publicly available. Nothing private is being shared with the LLM. When I mentioned, “I can’t go into specifics, as it’s confidential”, I was referring to my conversation, not the data itself.

Q: Has he breached IP laws by sharing outputs or work generated by others both internal and external to the business?
A: No.

Q: Does his company know he is doing this and has gotten approval to share this data etc. externally?
A: He’s the boss, so yes he has approval to do this. πŸ™‚

The death of independent content creators?

I recently tweeted about this article.

I’ve had a few conversations, both online and offline about this recently, so I thought I would write something to bring together a few of my thoughts.

I think it’s also worth mentioning this post I wrote a couple of years ago, as some of it is relevant to this discussion.

Why do people create content?

There are several reasons to create content on the internet, and everyone will have slightly different motivation, but here are some obvious reasons to do it.

  • Money. Some people and companies create content as a source of income. That income comes from advertising. The only way you get paid is if people visit your website or YouTube channel etc. Anything that reduces the number of people visiting your content directly affects how much money you earn.
  • Reputation. Some people use their content to build their reputation, or personal brand if you prefer that term. If nobody visits your content, then what is it really doing for your reputation?
  • Personal Reasons. I often say I write for myself. It’s how I learn stuff I need for my job. I recently put out three posts about ORDS. In all three cases I was writing something that I needed, and which I could point colleagues to. I will still do this if nobody else visits the website, but I have to admit the activation energy required to start a new post is extremely high when I think nobody will care about it. I will definitely write less often, and about fewer subjects, if I am just shouting into the wind. πŸ™‚

Where does traffic come from?

This will vary depending on the content, but the vast majority (98%) of my traffic comes directly from Google searches. That has been pretty consistent for many years.

I don’t know how many people now search on ChatGPT directly, or via Bing, but even if you make zero change to the old workflow of searching on Google, you will often get an AI answer anyway, so it hardly matters where people do their searches.

What is the problem?

Having said the number of visitors to your content is a big deal, anything that reduces the number of people visiting your content is a problem.

In approximately 2012 Google added Knowledge Panels to the Google search output. When you searched for something, the search results might include a Knowledge Panel at the top of the page. This was typically a quote from one of the top search results that might answer your question. I would often see quotes from my own posts in them. The Knowledge Panels might work well for a request for a specific command, but I often found them a bit crappy, and ended up clicking on the link to the site anyway.

In May 2024 Google started to include an AI Overview in the search results for some questions. Rather than being a quote from a website, this was a “generated” answer. The quality of these answers varies, but they are substantially better than the old Knowledge Panels.

How has this affected my website traffic?

Even though Knowledge Panels were crappy, I did notice a drop in the visits to some of my articles. If you are just searching for a bit of syntax and the Knowledge Panel happens to show it, there is no point visiting the link. The impact was noticeable, but not catastrophic.

The timing of the release of the AI Overview, and of course other AI search options, had a more dramatic effect on my website traffic. It’s about half what it used to be. If I were doing this for cash it would be a really big problem, because a drop in traffic like that, along with the general reduction in advertising revenues across the industry, is a big deal.

Are you sure it’s because of AI?

No. It could be a coincidence. A number of things are probably at play here. Things that spring to mind are…

  • The AI search thing generally, as well as the Google AI Overview already mentioned.
  • A reduction in the number of people searching for Oracle content generally. Most people I encounter have to do a variety of tasks, not just Oracle, so the number of times they are searching for answers about Oracle tech is reduced.
  • People usually search for things they are working on. The lack of the regular on-prem release of Oracle 23ai means the total number of people searching for 23ai content will be lower. Content about newer versions of the database is always less popular for a while, but the delay to the release has meant 23ai has dropped off the radar for many of us.
  • I’m publishing less content. Once Oracle 23c/23ai Free was released I published over 60 posts about it. I’ve got another 15+ waiting for the on-prem release. The lack of the on-prem release has left me feeling rather deflated, and I’m struggling to be bothered to write much. With less new content hitting the site, there is a tendency to get lost in the crowd. I’m sure some of the drop off in views is because of this. I don’t know how much though.
  • Improved documentation. It’s still far from perfect, but the Oracle documentation has improved. I’m sure some people will click a link to the docs in preference to some random website.

Even though it is likely to be a combination of these things, the timing of the inclusion of AI Overviews in Google search seems like a smoking gun to me.

So what?

It’s obviously disheartening for existing creators to see their viewership dwindle, but my ego is not really the central issue here. I was chatting over DMs to someone about this and I said this.

“The bigger issue is what happens to the next generation of content creators? With a million AI generated sites, and nobody actually hitting your site, what are the chances of them bothering?”

If there is nobody stepping in to fill our shoes, what happens when we all retire, as many of us already have?

AI feeding on itself

If there are no new people stepping up to produce new content, or even worse, the new folks just generate their content with AI, what happens? Everything I’ve read suggests that AI feeding on itself causes a downward spiral in terms of quality. You only have to look at what the YouTube algorithm promotes to know that crappy AI generated content is saturating the platform now. I’ve read numerous stories of the percentage of bot traffic on social media sites going through the roof. Where will it end up? See Dead Internet Theory.

Conclusion

I really hope I’m wrong, but I don’t think the future is positive for independent content creators. I’m sure there will always be some, but they will be increasingly lost in the noise of AI generated slop.

I’m an old man shouting at clouds, and it won’t be my problem for much longer. πŸ™‚

Cheers

Tim…

PS. While I was posting this to social media I noticed Jon Dixon had written Why AI Doesn’t Give Credit to Bloggers? It’s worth a look.

PPS. Someone sent me a link to this video on Twitter. Very scary.

Generative Data Dev and App Dev

This is a summary of some of the points covered in Juan Loaiza’s keynote at Oracle CloudWorld 2024, with links to relevant posts to help you get started using them.

AI Application Development

The first section of the presentation discussed generative data and application development. These are some of the subjects he touched on.

These subjects were revisited in the low-code section of the presentation. There was also mention of some of the features related to converged data architecture, discussed later.

Confidentiality, Consistency and Security

This section was more about letting the database do the heavy lifting, so each application doesn’t have to code specifically for confidentiality, consistency and security. Some of the features referenced included the following.

Low-Code Development

APEX has been around for a long time, but in the recent versions we have integration of generative AI to assist in the development process.

AI Vector Search

This section of the presentation focussed on the new VECTOR data type and AI vector search.

Converged Data Architecture

This was really a mixed bag of functionality that has been added over previous database versions, and extended further by Oracle database 23ai. Here are some of the features that combine to make this converged data architecture.

Simplifying Development

This section of the presentation was another mixed bag of new features.

As you can see, loads of things were referenced during the hour-long presentation. As a result many items were just mentioned, while others got more attention.

Cheers

Tim…

AI Prompt Engineer (AI-fu). The new Google-fu?

The other day I came across the term AI Prompt Engineer. It was in the context of being the next big thing in the job market. I did a bit of Googling and sure enough there are courses about it, and a number of jobs being offered. These seem to break down into two main types of role.

  1. A technical AI development role, involving understanding of AI and coding against various AI backends via APIs.
  2. A person who types in stuff to get the right outcome from specific AI tools.

As mentioned, the first is a technical development role, but the second seems like AI-fu to me…

Google-fu

Here is the Wikipedia definition of Google-fu.

“Skill in using search engines (especially Google) to quickly find useful information on the Internet.”

https://en.wiktionary.org/wiki/Google-fu

After all these years I’m still surprised how weak people’s Google-fu is. Searching for information requires knowing what to include in your search criteria, as well as evaluating the results to make sure they are actually what you need.

If you don’t understand how to filter the results using targeted search terms, you may get incorrect or misleading results. If you don’t understand the subject matter, you have no way of validating correctness of the responses.

AI-fu

I’ve made up the term AI-fu. I can’t see any references to it on the internet. The closest I can find is ChatGPT-fu, which I also made up here. πŸ™‚ I said Google-fu is about providing the appropriate inputs, and validating the outputs. I would suggest that AI-fu is very similar.

If you’ve played around with ChatGPT you will know there is a degree of “garbage in, garbage out” involved. You have to refine your inputs, either by altering the original question, or following a chain of thought (CoT) process.

How do you know you have a suitable result? Well, you have to understand the subject matter, and do some fact checking to make sure you are not generating garbage. If you don’t understand that a human hand typically has 5 digits, you can’t tell if that 7 fingered monstrosity you’ve just generated is wrong. πŸ™‚

Thoughts

I suspect those people who have acquired good Google-fu will also acquire good AI-fu. Those people who have proved to have consistently weak Google-fu are unlikely to ever develop strong AI-fu.

Cheers

Tim…

PS. If you want a certificate in AI-fu, please send Β£500 to my PayPal… πŸ˜‰

Writing Tips : Using AI to generate content

After my previous post on using ChatGPT to write code, I just wanted to say a few words about using artificial intelligence (AI) to generate content such as articles and blog posts.

I’ll list a few specific issues and give my comments on them.

Using AI for inspiration

I watched a video by Marques Brownlee discussing the Bing integration of ChatGPT (here), and one of the things he mentioned was using AI for inspiration. Not expecting AI to create a whole piece of work for you, but using a chat with the AI to come up with ideas.

It reminded me of one of my first attempts at using ChatGPT, which was this. πŸ™‚

Write a tweet about an idea for a new episode of star trek

If I were really trying to write something original, I might use this as the inspiration to create my own piece of work.

It should be noted, when I tweeted about this someone replied to say it was similar to the plot of a film they had seen, so we need to be careful the AI is not just stealing someone else’s idea. πŸ™‚

I have no problem with people using AI as part of the generation of ideas. Just be careful that the ideas you get are vaguely original. πŸ™‚

Turning bullet points into prose

One of my friends works for a company that ships physical products. The company has a paper catalogue, as well as an online store. He gets product details from the manufacturers and needs to pretty them up for use in their catalogue and website. He told me he is now using ChatGPT to do this.

To give you an idea of what he is doing, I copied some text from Amazon and asked ChatGPT to make it a bit nicer.

Rewrite this text into something nicer

In this case we aren’t expecting the AI to get facts from the internet. We are providing the base information and using the AI as a writing aid.

This is another use case I think is totally fine. It’s merely a tool that saves you a bit of time. People already use tools like Grammarly to help with spelling and grammar. This just seems like a logical next step to me.
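The key point in this workflow is that we supply all the facts and ask the model only to handle the wording. A minimal sketch of how I might build that kind of prompt, with entirely invented prompt text and a hypothetical function name, looks like this:

```python
# Hypothetical helper: turn a list of manufacturer bullet points into a
# prompt that asks the model to rewrite them as catalogue prose, while
# explicitly forbidding it from inventing new facts.
def prose_prompt(bullets: list[str]) -> str:
    facts = "\n".join(f"- {b}" for b in bullets)
    return (
        "Rewrite the following product details as a short, friendly "
        "paragraph for a catalogue and web store listing. Do not add "
        "any facts that are not in the list.\n\n" + facts
    )

# Example: prose_prompt(["Stainless steel body", "2 litre capacity"])
# produces a prompt containing both facts, ready to send to the model.
```

The “do not add any facts” instruction is the important bit: it keeps the model in writing-aid territory, rather than letting it fill in details from its training data.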

It makes mistakes

The AI doesn’t know anything about the content it is generating, so it can’t fact check itself. Here’s another example of something I Tweeted out. I asked ChatGPT if I should use leading or trailing commas when writing SQL.

When writing SQL, should I use leading or trailing commas?

It came back with a nice answer saying it is a personal preference, and gave an example of the two styles. The slight problem was the examples demonstrate the opposite of what they are meant to. πŸ™‚

A human can pick that up, correct it, and we will get something that seems reasonable, but it proves the point that we can’t blindly accept the output of AI content generation. We need to proofread and fact check it. This can be difficult if it doesn’t cite the sources used during the generation.

Sources and citations

Currently ChatGPT is based on a 2021 data set. When we use it we get no citations for the sources of information used during the generation process. This causes a number of problems.

  • It makes it hard to fact check the information.
  • It is impossible to properly cite the sources.
  • We can’t read the source material to check the AI’s interpretation is correct.
  • We can’t make a judgement on how much we trust the source material. Not all sources are reputable.
  • We can’t check to see if the AI has copied large pieces of text, leaving us open to copyright infringement. The generated text is supposedly unique, but can we be certain of that?

The Bing integration of ChatGPT does live searches of the internet, and includes citations for the information sources used, which solves many of these problems.

Copyright

AI content generation is still fairly new, but we are already seeing a number of issues related to copyright.

There are numerous stories about AI art generation infringing the copyright of artists, with many calling for their work to be opted out of the training data sets for AI, or to be paid for their inclusion. There is a line between inspiration and theft, and many believe AI art generation has crossed it. It’s possible this line has already been crossed in AI text generation also.

There is also the other side of copyright to consider. If you produce a piece of work using AI, it’s possible you can’t copyright that piece of work, since copyright applies to work created by a human. See the discussion here.

You can argue about the relative amounts of work performed by the AI and the human, but it seems that for 100% AI generation you are skating on thin ice. Of course, things can change as AI becomes more pervasive.

Who is paying for the source material to be created?

Like it or not, the internet is funded by ad revenue. Many people rely on views on their website to pay for their content creation. Anything that stops people actually visiting their site impacts on their income, and will ultimately see some people drop out of the content creation space.

When Google started including suggested answers in their search results, it already meant some people no longer needed to click on the source links. ChatGPT takes that one step further. If it becomes commonplace for people to search on Bing (or any other AI-backed search engine), and use the AI generated result presented, rather than visiting the source sites, this will have a massive impact on the original content creators. The combination of this and ad blockers may mean the end for some content creators.

If there is no original content on the internet, there is nothing for AI to use as source material, and we could hit a brick wall. Of course there will always be content on the internet, but I think you can see where I’m going with this.

So just like the copyright infringement issues with AI art, are we going to see problems with the source material used for AI text generation? Will search engines have to start paying people for the source material they use? We’ve already seen this type of issue with search engines reporting news stories.

The morality of writing whole posts with AI

This is where things start to get a bit tricky, and this is more about morality and ethics, rather than content.

Let’s say your job is to write content. Someone is paying you to spend 40 hours a week writing that content, and instead you spend a few minutes generating content with AI, and use the rest of the time to watch Netflix. You can argue you are delivering what is asked of you and making intelligent use of automation, or that you are stealing from the company because you are being paid for a job you are not doing. I’m guessing different people will have a different take on this from a moral perspective.

Continuing with the theme of being paid to write, what if the company you are working for is expecting to have copyright control over the work you produce? If it can be determined it is AI generated, they can’t copyright it, and that work can be republished with no comeback. I can see that making you rather unpopular.

Education establishments already use software to check for plagiarism. The use of AI is already making educational establishments nervous. OpenAI, the creators of ChatGPT, have already created an AI Text Classifier (discontinued) to identify text that has been generated by AI. I can only imagine these types of utilities will become commonplace, and you could find yourself in hot water if you are passing off AI generated work as your own. You will certainly lose your qualifications for doing it.

Many people use their blogs as an indication of their expertise. They are presenting themselves as well versed in a subject, which can then lead to other opportunities, such as job offers, invitations to conferences and inclusion in technology evangelism programs. If it becomes apparent the content is not your own work, it would seem logical that your professional reputation would be trashed, and you would lose some or all of the benefits you have gained.

Conclusion

There is no right or wrong answer here, but in my opinion it’s important we use AI as a tool, and not a mechanism to cheat. Where we draw the line will depend on the individual, and the nature of the work being done. Also, it’s possible that line in the sand will change over time…

Check out the rest of the series here.

Cheers

Tim…