VALICY – Virtual Validation System

VALICY – a virtual validation system for AI/ML and complex software applications

What is it?

VALICY is an AI-based Python framework that enables virtual validation of AI/ML (and other complex software) applications. It runs a multitude of AI instances to intelligently sample the test space, which is spanned by several continuous or state-based input parameters of a classification problem. Several targets can be predicted at the same time.

VALICY consists of: 

  • a relational database (MySQL) as a data hub,

  • a Python framework for testing black-box applications, which runs in Docker containers on a dedicated server,

  • a REST API for communicating with VALICY, providing test feedback, and retrieving new test proposals for a black-box application (e.g. a customer's AI),

  • a front end for inspecting results, downloading single runs for live testing, and tracking the development of the certainty of the result dimension(s).
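The feedback loop through the REST API can be sketched as follows. This is a minimal illustration only: the base URL, endpoint paths, and payload fields are assumptions, not VALICY's actual interface.

```python
import json
import urllib.request

API_BASE = "https://valicy.example.com/api"  # hypothetical base URL


def build_result_payload(proposal_id, targets):
    """Shape the feedback for one validated test point (illustrative schema)."""
    return {"proposal_id": proposal_id, "targets": targets}


def fetch_next_proposal(job_id):
    """Ask VALICY for the next proposed test point (hypothetical endpoint)."""
    with urllib.request.urlopen(f"{API_BASE}/jobs/{job_id}/proposals/next") as resp:
        return json.load(resp)


def post_result(job_id, payload):
    """Report the black-box result back so the sampler can refine its proposals."""
    req = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}/results",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The application under test stays a black box to VALICY: only input proposals and target feedback cross this interface.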

Contact

Janis Lapins
Spicetech


More Information:
White Paper on Artificial Intelligence: a European approach to excellence and trust

Why is it necessary?

The European AI Act is a proposed set of regulations for artificial intelligence (AI) systems within the European Union (EU), introduced in April 2021. While it is not yet law, it is currently under review by the European Parliament and the Council of the European Union. Several of its key provisions call for a systematic virtual validation approach such as the one VALICY applies. This is especially relevant to its risk-based approach: the AI Act would categorize AI systems by their potential risk level, with higher-risk systems facing stricter requirements. Because VALICY can quantify the remaining uncertainty across the application range, this methodology is expected to gain increasing interest, especially in safety-critical applications.

How does it work?

When an AI application is tested with VALICY, a job specification must first be provided. It specifies the number of input dimensions together with their minimum and maximum values, the number of target dimensions, and, for each target dimension, a threshold, its nature (upper or lower bound), and a certainty target. All dimensions can be provided in a scaled or normalized form and named arbitrarily to disguise what is being tested.
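A job specification of this shape could look as follows; the field names and structure are illustrative assumptions, not VALICY's actual schema.

```python
# Hypothetical job specification: two disguised, normalized inputs
# and one target dimension with an upper threshold.
job_spec = {
    "inputs": [
        {"name": "x1", "min": 0.0, "max": 1.0},   # arbitrary name, normalized range
        {"name": "x2", "min": -1.0, "max": 1.0},
    ],
    "targets": [
        {
            "name": "y1",
            "threshold": 0.8,          # pass/fail boundary for this target
            "nature": "upper",         # threshold acts as an upper bound
            "certainty_target": 0.95,  # required certainty in the application range
        },
    ],
}
```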

When a job is started, a regular grid over the input space is sampled first, and the values are exported through the API to be validated by the application under test.
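The initial grid step can be sketched as below; this is a generic regular-grid construction under assumed bounds, not VALICY's internal code.

```python
import itertools


def regular_grid(bounds, points_per_dim):
    """Build a regular grid over the input space.

    bounds: list of (min, max) pairs, one per input dimension.
    points_per_dim: number of evenly spaced samples along each axis.
    """
    axes = []
    for lo, hi in bounds:
        step = (hi - lo) / (points_per_dim - 1)
        axes.append([lo + i * step for i in range(points_per_dim)])
    # Cartesian product of all axes gives every grid point.
    return list(itertools.product(*axes))


# Two hypothetical inputs, e.g. one in [0, 100] and one in [-10, 40].
grid = regular_grid([(0.0, 100.0), (-10.0, 40.0)], points_per_dim=5)
# 5 points per axis over 2 dimensions -> 25 candidate test points.
```

Each grid point would then be sent through the API as a test proposal, and the returned feedback seeds the subsequent, AI-driven sampling.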