In statistics and control theory, Kalman filtering (also known as linear quadratic estimation) is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, to produce estimates of unknown variables that tend to be more accurate than those based on a single measurement, by estimating a joint probability distribution over the variables for each time-step. The filter is constructed as a mean squared error minimiser, but an alternative derivation of the filter is also provided showing how the filter relates to maximum likelihood statistics. The filter is named after Rudolf E. Kálmán. Kalman filtering has numerous technological applications. A common application is for guidance, navigation, and control of vehicles, particularly aircraft, spacecraft and dynamically positioned ships. Furthermore, Kalman filtering is widely applied in time series analysis tasks such as signal processing and econometrics. Kalman filtering is also important for robotic motion planning and control, and can be used for trajectory optimization. Kalman filtering is also used to model the central nervous system's control of movement. Because of the time delay between issuing motor commands and receiving sensory feedback, Kalman filters provide a realistic model for estimating the current state of a motor system and issuing updated commands. The algorithm works via a two-phase process: a prediction phase and an update phase. In the prediction phase, the Kalman filter produces estimates of the current state variables, including their uncertainties. Once the outcome of the next measurement (necessarily corrupted with some error, including random noise) is observed, these estimates are updated using a weighted average, with more weight given to estimates with greater certainty. The algorithm is recursive.
It can operate in real time, using only the present input measurements and the previously calculated state and its uncertainty matrix; no additional past information is required. Optimality of Kalman filtering assumes that errors have a normal (Gaussian) distribution. In the words of Rudolf E. Kálmán, "The following assumptions are made about random processes: Physical random phenomena may be thought of as due to primary random sources exciting dynamic systems. The primary sources are assumed to be independent gaussian random processes with zero mean; the dynamic systems will be linear." Regardless of Gaussianity, however, if the process and measurement covariances are known, then the Kalman filter is the best possible linear estimator in the minimum mean-square-error sense, although there may be better nonlinear estimators. It is a common misconception (perpetuated in the literature) that the Kalman filter cannot be rigorously applied unless all noise processes are assumed to be Gaussian. Extensions and generalizations of the method have also been developed, such as the extended Kalman filter and the unscented Kalman filter, which work on nonlinear systems. The basis is a hidden Markov model in which the state space of the latent variables is continuous and all latent and observed variables have Gaussian distributions. Kalman filtering has been used successfully in multi-sensor fusion and in distributed sensor networks, leading to distributed or consensus Kalman filtering. == History == The filtering method is named for Hungarian émigré Rudolf E. Kálmán, although Thorvald Nicolai Thiele and Peter Swerling developed a similar algorithm earlier. Richard S. Bucy of the Johns Hopkins Applied Physics Laboratory contributed to the theory, causing it to be known sometimes as Kalman–Bucy filtering. Kalman was inspired to derive the Kalman filter by applying state variables to the Wiener filtering problem. Stanley F. 
Schmidt is generally credited with developing the first implementation of a Kalman filter. He realized that the filter could be divided into two distinct parts, with one part for time periods between sensor outputs and another part for incorporating measurements. It was during a visit by Kálmán to the NASA Ames Research Center that Schmidt saw the applicability of Kálmán's ideas to the nonlinear problem of trajectory estimation for the Apollo program resulting in its incorporation in the Apollo navigation computer. This digital filter is sometimes termed the Stratonovich–Kalman–Bucy filter because it is a special case of a more general, nonlinear filter developed by the Soviet mathematician Ruslan Stratonovich. In fact, some of the special case linear filter's equations appeared in papers by Stratonovich that were published before the summer of 1961, when Kalman met with Stratonovich during a conference in Moscow. This Kalman filtering was first described and developed partially in technical papers by Swerling (1958), Kalman (1960) and Kalman and Bucy (1961). The Apollo computer used 2k of magnetic core RAM and 36k wire rope [...]. The CPU was built from ICs [...]. Clock speed was under 100 kHz [...]. The fact that the MIT engineers were able to pack such good software (one of the very first applications of the Kalman filter) into such a tiny computer is truly remarkable. Kalman filters have been vital in the implementation of the navigation systems of U.S. Navy nuclear ballistic missile submarines, and in the guidance and navigation systems of cruise missiles such as the U.S. Navy's Tomahawk missile and the U.S. Air Force's Air Launched Cruise Missile. They are also used in the guidance and navigation systems of reusable launch vehicles and the attitude control and navigation systems of spacecraft which dock at the International Space Station. 
== Overview of the calculation == Kalman filtering uses a system's dynamic model (e.g., physical laws of motion), known control inputs to that system, and multiple sequential measurements (such as from sensors) to form an estimate of the system's varying quantities (its state) that is better than the estimate obtained from any single measurement alone. As such, it is a common sensor fusion and data fusion algorithm. Noisy sensor data, approximations in the equations that describe the system evolution, and external factors that are not accounted for all limit how well it is possible to determine the system's state. The Kalman filter deals effectively with the uncertainty due to noisy sensor data and, to some extent, with random external factors. The Kalman filter produces an estimate of the state of the system as a weighted average of the system's predicted state and the new measurement. The purpose of the weights is that values with better (i.e., smaller) estimated uncertainty are "trusted" more. The weights are calculated from the covariance, a measure of the estimated uncertainty of the prediction of the system's state. The result of the weighted average is a new state estimate that lies between the predicted and measured state, and has a better estimated uncertainty than either alone. This process is repeated at every time step, with the new estimate and its covariance informing the prediction used in the following iteration. This means that the Kalman filter works recursively and requires only the last "best guess", rather than the entire history, of a system's state to calculate a new state. The measurements' certainty-grading and current-state estimate are important considerations. It is common to discuss the filter's response in terms of the Kalman filter's gain. The Kalman gain is the weight given to the measurements and current-state estimate, and can be "tuned" to achieve a particular performance.
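The predict-and-update cycle described above can be sketched for a single scalar state. This is only an illustrative sketch under simplifying assumptions, not the general matrix form: the state is one number, the assumed model simply adds a known input `u` at each step, and the function names and noise variances `q` and `r` are hypothetical choices made for the example.

```python
def predict(x, p, u=0.0, q=0.01):
    """Prediction phase: propagate the estimate and grow its variance.

    Assumed model (for illustration only): the state changes by a known
    input u each step, and process noise with variance q adds uncertainty.
    """
    x_pred = x + u
    p_pred = p + q
    return x_pred, p_pred


def update(x_pred, p_pred, z, r):
    """Update phase: blend the prediction with measurement z (variance r)."""
    k = p_pred / (p_pred + r)            # Kalman gain, between 0 and 1
    x_new = x_pred + k * (z - x_pred)    # weighted average of prediction and z
    p_new = (1.0 - k) * p_pred           # smaller than both p_pred and r
    return x_new, p_new, k


def run_filter(measurements, x0, p0, q, r):
    """Recursive use: only the last estimate and its variance are kept."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = predict(x, p, q=q)
        x, p, _ = update(x, p, z, r)
        estimates.append(x)
    return estimates
```

When the predicted variance and the measurement variance are equal, the gain is 0.5 and the new estimate lies halfway between the prediction and the measurement; a larger `r` (a less trusted sensor) lowers the gain and pulls the estimate toward the prediction.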
With a high gain, the filter places more weight on the most recent measurements, and thus conforms to them more responsively. With a low gain, the filter conforms to the model predictions more closely. At the extremes, a high gain (close to one) will result in a more jumpy estimated trajectory, while a low gain (close to zero) will smooth out noise but decrease the responsiveness. When performing the actual calculations for the filter (as discussed below), the state estimate and covariances are coded into matrices because of the multiple dimensions involved in a single set of calculations. This allows for a representation of linear relationships between different state variables (such as position, velocity, and acceleration) in any of the transition models or covariances. == Example application == As an example application, consider the problem of determining the precise location of a truck. The truck can be equipped with a GPS unit that provides an estimate of the position within a few meters. The GPS estimate is likely to be noisy; readings 'jump around' rapidly, though remaining within a few meters of the real position. In addition, since the truck is expected to follow the laws of physics, its position can also be estimated by integrating its velocity over time, determined by keeping track of wheel revolutions and the
SIP (software)
SIP is an open source software tool used to connect computer programs or libraries written in C or C++ with the scripting language Python. It is an alternative to SWIG. SIP was originally developed in 1998 for PyQt — the Python bindings for the Qt GUI toolkit — but is suitable for generating bindings for any C or C++ library. == Concept == SIP takes a set of specification (.sip) files describing the API and generates the required C++ code. This is then compiled to produce the Python extension modules. A .sip file is essentially the class header file with some things removed (because SIP does not include a full C++ parser) and some things added (because C++ does not always provide enough information about how the API works). For PyQt v4 I use an internal tool (written using PyQt of course) called metasip. This is sort of an IDE for SIP. It uses GCC-XML to parse the latest header files and saves the relevant data, as XML, in a metasip project. metasip then does the equivalent of a diff against the previous version of the API and flags up any changes that need to be looked at. Those changes are then made through the GUI and ticked off the TODO list. Generating the .sip files is just a button click. In my subversion repository, PyQt v4 is basically just a 20M XML file. Updating PyQt v4 for a minor release of Qt v4 is about half an hours work. In terms of how the generated code works then I don't think it's very different from how any other bindings generator works. Python has a very good C API for writing extension modules - it's one of the reasons why so many 3rd party tools have Python bindings. For every C++ class, the SIP generated code creates a corresponding Python class implemented in C. 
== Notable applications that use SIP ==
PyQt, a Python port of the application framework and widget toolkit Qt
QGIS, a free and open-source cross-platform desktop geographic information system (GIS)
QtiPlot, a computer program to analyze and visualize scientific data
calibre, a free and open-source cross-platform e-book manager
Veusz, a free and open-source cross-platform program to visualize scientific data
Ebert test
The Ebert test gauges whether a computer-based synthesized voice can tell a joke with sufficient skill to cause people to laugh. It was proposed by film critic Roger Ebert at the 2011 TED conference as a challenge to software developers to have a computerized voice master the inflections, delivery, timing, and intonations of human speech. The test is similar to the Turing test proposed by Alan Turing in 1950 as a way to gauge a computer's ability to exhibit intelligent behavior by generating performance indistinguishable from a human being. "If the computer can successfully tell a joke, and do the timing and delivery as well as Henny Youngman, then that's the voice I want." Ebert lost his voice in 2006 after undergoing surgery to treat thyroid cancer. He employed a Scottish company called CereProc, which custom-tailors text-to-speech software for voiceless customers who record their voices at length before losing them, and mined tapes and DVD commentaries featuring Ebert to create a voice that sounded more like his own. He first publicly used the voice they devised for him in his March 2, 2010, appearance on The Oprah Winfrey Show. The audience of Ebert's 2011 TED talk about joke delivery by synthesized voices erupted with laughter when a synthesized voice delivered the following joke: "A guy goes into a psychiatrist. The psychiatrist says, 'You’re crazy.' The guy says, 'I want a second opinion.' The psychiatrist says, 'All right, you’re ugly, too.'"
Ethics of artificial intelligence
The ethics of artificial intelligence covers a broad range of topics within AI that are considered to have particular ethical stakes. This includes algorithmic biases, fairness, accountability, transparency, privacy, and regulation, particularly where systems influence or automate human decision-making. It also covers various emerging or potential future challenges such as machine ethics (how to make machines that behave ethically), lethal autonomous weapon systems, arms race dynamics, AI safety and alignment, technological unemployment, AI-enabled misinformation, how to treat certain AI systems if they have a moral status (AI welfare and rights), artificial superintelligence and existential risks. Some application areas may also have particularly important ethical implications, like healthcare, education, criminal justice, or the military. == Machine ethics == Machine ethics (or machine morality) is the field of research concerned with designing Artificial Moral Agents (AMAs), robots or artificially intelligent computers that behave morally or as though moral. To account for the nature of these agents, it has been suggested to consider certain philosophical ideas, like the standard characterizations of agency, rational agency, moral agency, and artificial agency, which are related to the concept of AMAs. There are discussions on creating tests to see if an AI is capable of making ethical decisions. Alan Winfield concludes that the Turing test is flawed and the requirement for an AI to pass the test is too low. A proposed alternative test is one called the Ethical Turing Test, which would improve on the current test by having multiple judges decide if the AI's decision is ethical or unethical. Neuromorphic AI could be one way to create morally capable robots, as it aims to process information similarly to humans, nonlinearly and with millions of interconnected artificial neurons. 
Similarly, whole-brain emulation (scanning a brain and simulating it on digital hardware) could also in principle lead to human-like robots, thus capable of moral actions. And large language models are capable of approximating human moral judgments. Inevitably, this raises the question of the environment in which such robots would learn about the world and whose morality they would inherit – or if they end up developing human 'weaknesses' as well: selfishness, pro-survival attitudes, inconsistency, scale insensitivity, etc. In Moral Machines: Teaching Robots Right from Wrong, Wendell Wallach and Colin Allen conclude that attempts to teach robots right from wrong will likely advance understanding of human ethics by motivating humans to address gaps in modern normative theory and by providing a platform for experimental investigation. As one example, it has introduced normative ethicists to the controversial issue of which specific learning algorithms to use in machines. For simple decisions, Nick Bostrom and Eliezer Yudkowsky have argued that decision trees (such as ID3) are more transparent than neural networks and genetic algorithms, while Chris Santos-Lang argued in favor of machine learning on the grounds that the norms of any age must be allowed to change and that natural failure to fully satisfy these particular norms has been essential in making humans less vulnerable to criminal "hackers". Some researchers frame machine ethics as part of the broader AI control or value alignment problem: the difficulty of ensuring that increasingly capable systems pursue objectives that remain compatible with human values and oversight. Stuart Russell has argued that beneficial systems should be designed to (1) aim at realizing human preferences, (2) remain uncertain about what those preferences are, and (3) learn about them from human behaviour and feedback, rather than optimizing a fixed, fully specified goal. 
Some authors argue that apparent compliance with human values may reflect optimization for evaluation contexts rather than stable internal norms, complicating the assessment of alignment in advanced language models. == Challenges == === Algorithmic biases === AI has become increasingly inherent in facial and voice recognition systems. These systems may be vulnerable to biases and errors introduced by their human creators. Notably, the data used to train them can have biases. According to Allison Powell, associate professor at LSE and director of the Data and Society programme, data collection is never neutral and always involves storytelling. She argues that the dominant narrative is that governing with technology is inherently better, faster and cheaper, but proposes instead to make data expensive, and to use it both minimally and valuably, with the cost of its creation factored in. Friedman and Nissenbaum identify three categories of bias in computer systems: existing bias, technical bias, and emergent bias. In natural language processing, problems can arise from the text corpus—the source material the algorithm uses to learn about the relationships between different words. Large companies such as IBM, Google, etc. that provide significant funding for research and development have made efforts to research and address these biases. One potential solution is to create documentation for the data used to train AI systems. Process mining can be an important tool for organizations to achieve compliance with proposed AI regulations by identifying errors, monitoring processes, identifying potential root causes for improper execution, and other functions. However, there are also limitations to the current landscape of fairness in AI, due to the intrinsic ambiguities in the concept of discrimination, both at the philosophical and legal level. ==== Racial and gender biases ==== Bias can be introduced through historical data used to train AI systems. 
For instance, Amazon terminated its use of AI in hiring and recruitment because the algorithm favored male candidates over female ones. This was because Amazon's system was trained with data collected over a 10-year period that included mostly male candidates. The algorithms learned the biased pattern from the historical data, and generated predictions where these types of candidates were most likely to succeed in getting the job. Therefore, the recruitment decisions made by the AI system turned out to be biased against female and minority candidates. The performance of facial recognition and computer vision models may vary based on race and gender. Facial recognition algorithms made by Microsoft, IBM and Face++ all performed significantly worse on darker-skinned women. Facial recognition was shown to be biased against those with darker skin tones. AI systems may be less accurate for black people, as was the case in the development of an AI-based pulse oximeter that overestimated blood oxygen levels in patients with darker skin, causing issues with their hypoxia treatment. In 2015, controversy erupted after a Black couple were labeled "Gorillas" by Google Photos. Often the systems can easily detect the faces of white people while being unable to register the faces of people who are black. This has led to bans on police use of AI materials or software in some U.S. states. One reason for these biases is that AI systems draw on information from across the internet to shape their responses in each situation. For example, if a facial recognition system were only tested on people who were white, it would be much harder for it to interpret the facial structure and tones of other races and ethnicities. Biases often stem from the training data rather than the algorithm itself, notably when the data represents past human decisions. 
A 2020 study that reviewed voice recognition systems from Amazon, Apple, Google, IBM, and Microsoft found that they have higher error rates when transcribing black people's voices than white people's. Injustice in the use of AI is much harder to eliminate within healthcare systems, as oftentimes diseases and conditions can affect different races and genders differently. This can lead to confusion as the AI may be making decisions based on statistics showing that one patient is more likely to have problems due to their gender or race. This can be perceived as a bias because each patient is a different case, and AI is making decisions based on what it is programmed to group that individual into. This leads to a discussion about what should be considered a biased decision in the distribution of treatment. While it is known that there are differences in how diseases and injuries affect different genders and races, there is a discussion on whether it is fairer to incorporate this into healthcare treatments, or to examine each patient without this knowledge. In modern society there are certain tests for diseases, such as breast cancer, that are recommended to certain groups of people over others because they are more likely to contract the disease in question. If AI implements these statistics
Pinakes
The Pinakes (Ancient Greek: Πίνακες 'tables', plural of πίναξ pinax) is a lost bibliographic work composed by Callimachus (310/305–240 BCE) that is popularly considered to be the first library catalog in the West; its contents were based upon the holdings of the Library of Alexandria during Callimachus's tenure there during the third century BCE. == History == The Library of Alexandria had been founded by Ptolemy I Soter about 306 BCE. The first recorded librarian was Zenodotus of Ephesus. During Zenodotus' tenure, Callimachus, who was never the head librarian, compiled many catalogues/lists, each called Pinakes. His most famous one listed authors and their works; thus he became the first known bibliographer and the scholar who organized the library by authors and subjects about 245 BCE. His work was 120 volumes long. Apollonius of Rhodes was the successor to Zenodotus. Eratosthenes of Cyrene succeeded Apollonius in 235 BCE and compiled his tetagmenos epi teis megaleis bibliothekeis, the 'scheme of the great bookshelves'. In 195 BCE Aristophanes of Byzantium, Eratosthenes' successor, was the librarian and updated the Pinakes, although it is also possible that his work was not a supplement of Callimachus' Pinakes themselves, but an independent polemic against, or commentary upon, their contents. == Description == The collection at the Library of Alexandria contained nearly 500,000 papyrus scrolls, which were grouped together by subject matter and stored in bins. Each bin carried a label with painted tablets hung above the stored papyri. The Pinakes were named after these tablets and are a set of index lists. The bins gave bibliographical information for every roll. A typical entry started with a title and also provided the author's name, birthplace, father's name, any teachers the author trained under, and educational background. It contained a brief biography of the author and a list of the author's publications. 
The entry had the first line of the work, a summary of its contents, the name of the author, and information about the origin of the roll, as well as any doubts about the genuineness of the ascription. Callimachus' system divided works into six genres of poetry and five sections of prose: rhetoric, law, epic, tragedy, comedy, lyric poetry, history, medicine, mathematics, natural science, and miscellanies. Each category was alphabetized by author. Callimachus composed two other works that were referred to as pinakes and were probably somewhat similar in format to the Pinakes (of which they "may or may not be subsections"), but were concerned with individual topics. These are listed by the Suda as: A Chronological Pinax and Description of Didaskaloi from the Beginning and Pinax of the Vocabulary and Treatises of Democritus. == Later bibliographic pinakes == The term pinax was used for bibliographic catalogs beyond Callimachus. For example, Ptolemy-el-Garib's catalog of Aristotle's writings comes to us with the title Pinax (catalog) of Aristotle's writings. == Legacy == The Pinakes proved indispensable to librarians for centuries, and they became a model for organizing knowledge throughout the Mediterranean. Their later influence can be traced to medieval times, even to the Arabic counterpart of the tenth century: Ibn al-Nadim's Al-Fihrist ("Index"). Local variations for cataloging and library classification continued through the late 19th century, when Anthony Panizzi and Melvil Dewey paved the way for more shared and standardized approaches.
Client-side persistent data
Client-side persistent data or CSPD is a term used in computing for storing data required by web applications to complete internet tasks on the client side as needed rather than exclusively on the server. As a framework it is one solution to the needs of occasionally connected computing (OCC). A major challenge for HTTP as a stateless protocol has been asynchronous tasks. The AJAX pattern using XMLHttpRequest was first introduced by Microsoft in the context of the Outlook e-mail product. The first CSPD were the 'cookies' introduced by Netscape Navigator. ActiveX components that have entries in the Windows registry can also be viewed as a form of client-side persistence.
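As a rough illustration of cookies as client-side persistence, the sketch below uses Python's standard `http.cookies` module to build the value of a `Set-Cookie` header that a server might send, and to parse the `Cookie` header a browser would send back on later requests. The helper names, the cookie names and the one-hour lifetime are hypothetical choices for the example, not part of any particular framework.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value, max_age_seconds):
    """Build the value of a Set-Cookie header asking the client to keep data."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # how long the client keeps it
    cookie[name]["path"] = "/"                 # sent back for every path
    return cookie[name].OutputString()


def parse_cookie_header(header):
    """Parse the Cookie header the client sends back into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {key: morsel.value for key, morsel in cookie.items()}


# Server side: ask the browser to remember a (hypothetical) draft identifier.
header_value = make_set_cookie("draft_id", "42", 3600)

# Later request: the browser returns what it stored, restoring the state.
stored = parse_cookie_header("draft_id=42; theme=dark")
```

Because HTTP itself is stateless, the stored value travels with each subsequent request, which is what lets the application pick up where the client left off.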
Mistral AI
Mistral AI SAS (French: [mistʁal]) is a French artificial intelligence (AI) company, headquartered in Paris. Founded in 2023, it produces large language models (LLMs), including both open-weight, open-source models and proprietary ones. As of 2025 the company has a valuation of more than US$14 billion. == Namesake == The company is named after the mistral, a powerful, cold wind in southern France, a term which originates from the Occitan language. == History == Mistral AI was established in April 2023 by three French AI researchers, Arthur Mensch, Guillaume Lample and Timothée Lacroix. Mensch, an expert in advanced AI systems, is a former employee of Google DeepMind; Lample and Lacroix, meanwhile, are specialists in large-scale AI models who had worked for Meta Platforms. The trio originally met during their studies at École Polytechnique. == Company operation == === Funding === In June 2023, the start-up raised €105 million ($117 million) in its first funding round, with investors including the American fund Lightspeed Venture Partners, Eric Schmidt, Xavier Niel and JCDecaux. The valuation was then estimated by the Financial Times at €240 million ($267 million). On 10 December 2023, Mistral AI announced that it had raised €385 million ($428 million) as part of its second fundraising. This round of financing involved the Californian fund Andreessen Horowitz, BNP Paribas and the software publisher Salesforce. The company was then valued at over €2 billion. On 26 February 2024, Microsoft announced an investment of $16 million in Mistral AI. On 16 April 2024, reporting revealed that Mistral was in talks to raise €500 million, a deal that would more than double its then-current valuation to at least €5 billion. In June 2024, Mistral AI secured a €600 million ($645 million) funding round, increasing its valuation to €5.8 billion ($6.2 billion). Based on valuation, as of June 2024, the company was ranked fourth globally in the AI industry, and first outside the San Francisco Bay Area. 
In April 2025, Mistral AI announced a €100 million partnership with the shipping company CMA CGM. In August 2025, the Financial Times reported that Mistral was in talks to raise $1 billion at a $10 billion valuation. In September 2025, Bloomberg reported that Mistral AI had secured a €2 billion investment valuing it at €12 billion ($14 billion). This came after a $1.5 billion investment from Dutch company ASML, which owns 11% of Mistral. In February 2026, Mistral acquired Koyeb, a Paris-based AI startup. Later that month, Mistral AI announced a multi-year strategic partnership with Accenture to help enterprises deploy sovereign AI solutions at scale. In March 2026 Mistral raised $830 million in order to build new datacenters near Paris and in Sweden. == Services == On 19 November 2024, the company announced updates for Le Chat (pronounced /lə ʃa/ in French, like the French word for "cat"). It added the ability to create images, using Black Forest Labs' Flux Pro model. On 6 February 2025, Mistral AI released Le Chat on iOS and Android mobile devices. Mistral AI also introduced a Pro subscription tier, priced at $14.99 per month, which provides access to more advanced models, unlimited messaging, and web browsing. At the end of May 2026, Le Chat was renamed Vibe, and new features were introduced at the same time. == Models == The following table lists the main model versions of Mistral, describing the significant changes included with each version: === Mistral 7B === Mistral AI claimed in the Mistral 7B release blog post that the model outperforms LLaMA 2 13B on all benchmarks tested, and is on par with LLaMA 34B on many benchmarks tested, despite having only 7 billion parameters, a small size compared to its competitors. === Mixtral 8x7B === Mistral AI claimed in 2023 that its model beat both LLaMA 70B and GPT-3.5 in most benchmarks. 
In March 2024, research conducted by Patronus AI comparing performance of LLMs on a 100-question test with prompts to generate text from books protected under U.S. copyright law found that OpenAI's GPT-4, Mixtral, Meta AI's LLaMA-2, and Anthropic's Claude 2 generated copyrighted text verbatim in 44%, 22%, 10%, and 8% of responses respectively. === Mistral Small 3.1 === On 17 March 2025, Mistral released Mistral Small 3.1 as a smaller, more efficient model. === Mistral Medium 3 === On 7 May 2025, Mistral AI released Mistral Medium 3. === Magistral Small and Magistral Medium === On 10 June 2025, Mistral AI released its first AI reasoning models: Magistral Small (open-source) and Magistral Medium, models which are purported to have chain-of-thought capabilities. === Mistral Large 3 and Ministral 3 === On 2 December 2025, Mistral AI released Mistral Large 3, a sparse, mixture-of-experts model with 41 billion active parameters and 675 billion total parameters, and Ministral 3, three small, dense models with 3 billion, 7 billion and 14 billion parameters. === Devstral 2 and Devstral Small 2 === On 10 December 2025, Mistral AI released Devstral 2 and Devstral Small 2. Devstral Small 2, a 24B-parameter model, is claimed to achieve better coding performance than the Qwen 3 Coder Flash model, which has 30B parameters.