"Although difficult (because most LLM datasets exclude security data), all the above seem like LLMs can theoretically automate them because you can find or easily create datasets containing the information to train the LLM." Do they exclude the data from training altogether, or train on it and filter it from what the public can access? If it is filtered, could an argument be made for giving a trusted partner of the LLM owners access for the very specific purpose of "protecting the public"?

