Metadata

Metadata

Metadata (or metainformation) is data (or information) that defines and describes the characteristics of other data. It often helps to describe, explain, locate, or otherwise make data easier to retrieve, use, or manage. For example, the title, author, and publication date of a book are metadata about the book. But, while a data asset is finite, its metadata is infinite. As such, efforts to define, classify types, or structure metadata are expressed as examples in the context of its use. The term "metadata" has a history dating to the 1960s where it occurred in computer science and in popular culture. Different types of metadata serve different functions. For example, descriptive metadata for a document might include the author, creation date, file size and keywords. Metadata has various purposes. It can help users find relevant information and discover resources. It can also help organize electronic resources, provide digital identification, and archive and preserve resources. Metadata allows users to access resources by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information". Metadata of telecommunication activities including Internet traffic is very widely collected by various national governmental organizations. This data is used for the purposes of traffic analysis and can be used for mass surveillance. Unique metadata standards exist for different disciplines (e.g., museum collections, digital audio files, websites, etc.). Describing the contents and context of data or data files increases its usefulness. For example, a web page may include metadata specifying what software language the page is written in (e.g., HTML), what tools were used to create it, what subjects the page is about, and where to find more information about the subject. This metadata can automatically improve the reader's experience and make it easier for users to find the web page online. A CD may include metadata providing information about the musicians, singers, and songwriters whose work appears on the disc. In many countries, government organizations routinely store metadata about emails, telephone calls, web pages, video traffic, IP connections, and cell phone locations. == Types == There are many distinct types of metadata, including: Descriptive metadata – the descriptive information about a resource. It is used for discovery and identification. It includes elements such as title, abstract, author, and keywords. Structural metadata – metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials. Administrative metadata – the information to help manage a resource, like resource type, and permissions, and when and how it was created. Reference metadata – the information about the contents and quality of statistical data. Statistical metadata – also called process data, may describe processes that collect, process, or produce statistical data. Legal metadata – provides information about the creator, copyright holder, and public licensing, if provided. Metadata is not strictly bound to one of these categories, as it can describe a piece of data in many other ways. While the metadata application is manifold, covering a large variety of fields, there are specialized and well-accepted models to specify types of metadata. Bretherton & Singley (1994) distinguish between two distinct classes: structural/control metadata and guide metadata. Structural metadata describes the structure of database objects such as tables, columns, keys and indexes. Guide metadata helps humans find specific items and is usually expressed as a set of keywords in a natural language. According to Ralph Kimball, metadata can be divided into three categories: technical metadata (or internal metadata), business metadata (or external metadata), and process metadata. Dan Linstedt, creator of the data vault methodology, says business metadata "...provide[s] definition of the functionality, definition of the data, definition of the elements, and definition of how the data is used within business...business metadata includes business requirements, time-lines, business metrics, business process flows, and business terminology." Business metadata is important because it can greatly facilitate the usefulness of the data to business people. A simple example of business metadata is a glossary entry. Hover functionality in an application or web form can enable a glossary definition to be shown when cursor is on a field or term. Other examples of business metadata include annotation ability within applications. For example, a business user may be viewing a business intelligence (BI) report and notice a trend in the data. The user may have background knowledge as to why this trend occurs. Some business intelligence tools enable the user to create an annotation within the report that explains the trend. Such an annotation can enhance other users' understanding of the data. This example is especially powerful because it is created by a business user for the use of other business people. NISO distinguishes three types of metadata: descriptive, structural, and administrative. Descriptive metadata is typically used for discovery and identification, as information to search and locate an object, such as title, authors, subjects, keywords, and publisher. Structural metadata describes how the components of an object are organized. An example of structural metadata would be how pages are ordered to form chapters of a book. Finally, administrative metadata gives information to help manage the source. Administrative metadata refers to the technical information, such as file type, or when and how the file was created. Two sub-types of administrative metadata are rights management metadata and preservation metadata. Rights management metadata explains intellectual property rights, while preservation metadata contains information to preserve and save a resource. Statistical data repositories have their own requirements for metadata in order to describe not only the source and quality of the data but also what statistical processes were used to create the data, which is of particular importance to the statistical community in order to both validate and improve the process of statistical data production. An additional type of metadata beginning to be more developed is accessibility metadata. Accessibility metadata is not a new concept to libraries; however, advances in universal design have raised its profile. Projects like Cloud4All and GPII identified the lack of common terminologies and models to describe the needs and preferences of users and information that fits those needs as a major gap in providing universal access solutions. Those types of information are accessibility metadata. The Schema.org website has incorporated several accessibility properties based on IMS Global Access for All Information Model Data Element Specification. While the efforts to describe and standardize the varied accessibility needs of information seekers are beginning to become more robust, their adoption into established metadata schemas has not been as developed. For example, while Dublin Core (DC)'s "audience" and MARC 21's "reading level" could be used to identify resources suitable for users with dyslexia and DC's "format" could be used to identify resources available in braille, audio, or large print formats, there is more work to be done. == History == Metadata was traditionally used in the card catalogs of libraries until the 1980s when libraries converted their catalog data to digital databases. In the 2000s, as data and information were increasingly stored digitally, this digital data was described using metadata standards. An early description of "meta data" for computer systems was written by David Griffel and Stuart McIntosh at the MIT Center for International Studies in 1967: "In summary then, we have statements in an object language about subject descriptions of data and token codes for the data. We also have statements in a meta language describing the data relationships and transformations, and ought/is relations between norm and data." == Definition == Metadata means "data about data". Metadata is defined as the data providing information about one or more aspects of the data; it is used to summarize basic information about data that can make tracking and working with specific data easier. Some examples include: Means of creation of the data Source of the data Time and date of creation Creator or author of the data Location on a computer network where the data was created Standards used Data quality For example, a digital image may include metadata that describes the size of the image, its color depth, resolution,

Final Cut Express

Final Cut Express was a video editing software suite created by Apple Inc. It was the consumer version of Final Cut Pro and was designed for advanced editing of digital video as well as high-definition video, which was used by many amateur and professional videographers. Final Cut Express was considered a step above iMovie in terms of capabilities, but a step underneath Final Cut Pro and its suite of applications. As of June 21, 2011, Final Cut Express was discontinued in favor of Final Cut Pro X. == History == Final Cut Express 1.0, based on Final Cut Pro 3, was released at Macworld Conference and Expo in San Francisco in 2003. The second version, based on Final Cut Pro 4, was released at Macworld San Francisco in 2004. The third version, capable of editing high definition video, was also announced at Macworld San Francisco a year later, and was released as Final Cut Express HD in February 2005. It was based on Final Cut Pro HD (version 4.5) and included LiveType 1.2 and Soundtrack 1.2. Final Cut Express version 3.5 was released with little fanfare in May 2006 as a Universal Binary. In addition to improving real-time rendering with Dynamic RT, version 3.5 upgraded LiveType to version 2.0 and Soundtrack to version 1.5. In November 2007, Apple released Final Cut Express 4, which although it did not support real-time editing in the AVCHD format (it only allowed for transcoding AVCHD to Apple Intermediate Codec (AIC) provided that the camera was actually attached to the computer - it did not convert AVCHD files stored elsewhere and is currently for Intel processors only), imported iMovie '08 projects and included 50 new filters. It did not include Soundtrack 1.5, but it still included LiveType which enables users to create advanced text for the movies they created in Final Cut. The price was dropped from $299 for version 3.5 to $199 for version 4.0. In June 2011, Final Cut Express was officially discontinued, in favor of Final Cut Pro X. == Features == Final Cut Express' interface was identical to that of Final Cut Pro, but lacks some film-specific features, including Cinema Tools, multi-cam editing, batch capture, and a time code view. The program performed 32 undo operations, while Final Cut Pro did 99 [2]. Features the program did include were: The ability to keyframe filters Dynamic RT, which changes real-time settings on-the-fly Motion path keyframing Opacity keyframing Ripple, roll, slip, slide and blade edits Picture-in-picture and split-screen effects Up to 99 video tracks and 12 compositing modes Up to 99 audio tracks Motion project import Two-way color correction. Chroma key One feature of Final Cut Express that was not available in Final Cut Pro is the ability to import iMovie '08 projects (though transitions are not preserved). === RT Extreme === Inherited from Final Cut Pro, Final Cut Express features RT Extreme, which allows previews of some video filters and transitions without rendering. Audio that is not in the native AIFF file format needs rendering before it can be played back. RT Extreme has three modes: 'Safe', for seeing multiple video layers at a quality that more or less guarantees a smooth playback; 'Unlimited', which allows the maximum number of composited video layers to be viewed at the same time; and 'Dynamic', which alternates between these settings depending on how many simultaneous video tracks are present. Frame dropping may result from using 'Unlimited' on low-resource machines. === Boris Calligraphy === Like Final Cut Pro, Express also comes with Boris Calligraphy, a plugin for advanced titling and scrolling/crawling titles more sophisticated than the ones that can be created with the built-in title overlays. Calligraphy has a WYSIWYG interface and features wrapping, alignment, leading, kerning and tracking features, as well as allowing up to five custom outlines and five custom drop shadows to be defined for a selected portion of the title. == Soundtrack == Prior to version 4, Final Cut Express included Soundtrack 1.5, a music program similar to the consumer-level GarageBand, but designed for videographers who wish to add music to their films. Soundtrack comes with around 4,000 professionally recorded instrument loops and sound effects that can be arranged in multiple tracks beneath the video track. To use Soundtrack, users export their Final Cut Express sequence, or a marked portion thereof, as a reference file, which can include scoring markers defined in the timeline. This reference file can be imported as the video track in Soundtrack. Soundtrack is functionally and visually identical to Soundtrack Pro's multitrack editing mode, but includes fewer Logic plugins and lacks the highly regarded noise removal tool. Soundtrack was removed from Final Cut Express 4, which lowered its price and may have encouraged people to buy Logic Express.

CHAOS (chess)

CHAOS (Chess Heuristics and Other Stuff) is a chess playing program that was developed by programmers working at the RCA Systems Programming division in the late 1960s. It played competitively in computer chess competitions in the 1970s and 1980s. It differed from other programs of that era in its look-ahead philosophy, choosing to use chess knowledge to evaluate fewer positions and continuations as opposed to simple evaluations that relied on deep look-ahead to avoid bad moves. == Introduction == CHAOS was originally developed by Ira Ruben, Fred Swartz, Victor Berman, Joe Winograd and William Toikka while working at RCA in Cinnaminson, NJ. Its name is an acronym for 'Chess Heuristics and Other Stuff.' Program development moved to the Computing Center of the University of Michigan when Swartz changed jobs, and Mike Alexander joined the development group. Swartz, Alexander and Berman were continuously group members from that point onward in CHAOS' evolution, as others of the original authors left and new members contributed episodically. Chess Senior Master Jack O'Keefe contributed to CHAOS' development from about 1980 onwards. CHAOS was written in Fortran, except for low-level board representation manipulations written in assembly language or C. Due to this portability, it ran on RCA, Univac and IBM-compatible mainframes in its lifetime. CHAOS heralds from the mainframe computing era when only machines of that capacity were able to play at a high level. Consequently, development and testing could only take place at off-peak times for production use of the machine. In a competition, CHAOS had to run on a dedicated mainframe with a telephone link to the match venue. In its later years, CHAOS ran on computers on the machine assembly floor of Amdahl Corporation on MTS. == Background == === Chess and artificial intelligence === Mathematicians Claude Shannon and Alan Turing, working separately, were the first to view playing chess as a challenge to machines. Working for AT&T / Bell Labs with its access to telephone switching equipment, Shannon built a relay-based machine that learned how to work its way through a two-dimensional, 5x5 cell maze in 1949. Shannon viewed this as an analogue of the way that organisms learn things about their natural environment. There is a random element to searching it, a memory element to benefit from the search outcome, and a reward element that reinforces learning when the global outcome is favorable to the organism. Soon afterward, Shannon wrote a mathematical analysis of the game of chess, published in 1950. Like with the maze, he broke down game play into the necessary elements for reinforcement learning. Associated with each board configuration a move will be made from, there is a numerical score. To decide what move to make, a player wants to maximize their own position's score after the move and to minimize their opponent's score (a minimax view). Since there are about 32 possible moves at each of the early stages of the game, and about 40 moves and responses in each game, then there are about 32 80 {\displaystyle 32^{80}} or about 10 120 {\displaystyle 10^{120}} possible games - an impossibly large set to evaluate completely. Therefore, there must be a way to limit the number of moves to look ahead for to find the best one. Reducing the game to these few key elements provided a way to think about human intelligence in general. Shannon became part of a wider group using computing machines to mimic aspects of human intelligence that grew into the general idea of artificial intelligence. (Other members of this group were John McCarthy, Herbert Simon, Allen Newell, Alan Kotok, Alex Bernstein and Richard Greenblatt.) The paradigm that evolved was that there was a quantification of the position on the board into a score, an evaluation method to find favorable outcomes (minimax, later alpha-beta pruning), and a strategy to manage the combinatorial explosion of the look-ahead possibilities. By the early 1960s, there were computer programs that played chess at a rudimentary level. They used very simple evaluation functions for each position and tried to search as far forward as was practical given the time constraints and available compute power. Naturally, programmers optimized their code to use the available computing resources. This led to a major philosophical divide among chess programs: those that tried to evaluate as many positions as possible, and those that tried to evaluate the most promising move sequences as deeply as possible. CHAOS was firmly in the camp believing only the most promising moves should be evaluated in depth. Said Swartz, "The 'brute force people' ... look at every (possible move) no matter what garbage it is. Most moves are just terrible, terrible moves, and most computing time is being spent on pure garbage." The program spent more time evaluating each board position in the expectation that it would find the most promising lines of play to explore in depth. In 1983, the then-fastest chess program (Belle) evaluated 110,000 positions per second, and typical programs 1000–50,000 per second, whereas CHAOS evaluated about 50-100 per second. === Machine learning and strategies to manage search === From about 1949 onward, Arthur Samuel began work for IBM on machine learning, culminating in a checkers-playing program in 1952 and publications on the topic. Concurrently, Christopher Strachey created Checkers, a program to play the board game of checkers in 1951, but it had no capacity to learn from its play. Checkers was chosen by both authors because it was simpler than chess yet contained the basic characteristics of an intellectual activity, and, in Samuel's view, was a test-bed in which heuristic procedures and learning processes could be evaluated quickly. Checker playing programs introduced the notion of the game tree and evaluating play to various depths to choose the best move. The complexity of chess, however, promoted it to the status of an analogue for human intelligence, and it attracted computer scientists' attention, who referred to it as research into artificial intelligence (AI). Like checkers, it required a numerical assessment of each arrangement of chess pieces on a board. It also required looking ahead to future moves to decide how to play the present position. Due to the enormous number of possible moves, there had to be a way to confine the look-ahead search to the most promising lines of play. From these factors, the notion of minimax score evaluation developed and, later, alpha-beta tree pruning to abandon looking at positions worse than any that have already been examined. === Chess search strategies === The AI community viewed artificial intelligence as comprising two parts: a way to symbolically quantify the knowledge in hand (a chess board position), and a set of heuristics to limit look-ahead to the consequences of a move. The early chess playing programs attempted to look forward as far as possible, perhaps to 3 moves ahead by each player, and to choose the best outcome. This led to the horizon effect, whereby a key move 4 or more moves ahead would be unexamined and therefore missed. Consequently, the programs were quite weak and heuristics to manage the search became important in their development. CHAOS used a selective search strategy with iterative widening. As chess programs evolved, they incorporated books of opening lines of play from historic sources. Nowadays, book moves are catalogued in machine-readable form, but originally programmers had to type them in. CHAOS had an extensive book for its time of around 10,000 moves that O'Keefe helped to develop. A problem with play from an opening book is the behavior of the program when the play leaves the book: the positional advantage may be so subtle that the evaluation scheme may be unable to understand it, leading to very wide and shallow searches to establish a line of play. The horizon effect again plagues move selection after leaving the book. CHAOS mitigated these problems by only using book lines that it could understand, and by relying on cached analyses of continuations out of the book made while the opponent's clock was running. == Game Play History == CHAOS played in twelve ACM computer chess tournaments and four World Computer Chess Championships (WCCC). Its debut was the ACM computer chess tournament in 1973, taking 2nd place. In 1974, it again won 2nd place in the WCCC, defeating the tournament favorite Chess 4.0 but losing to Kaissa. CHAOS was close to winning the 1980 WCCC, but lost to Belle in a playoff. The 1985 ACM computer chess tournament was CHAOS' last competition. One of CHAOS' notable victories was over Chess 4.0 at the 1974 WCCC tournament. Chess 4.0 was unbeaten by any other program up until then. Playing as white, CHAOS made a knight sacrifice (16 Nd4-e6!!) that traded material for open lines of attack and eventually won the game. CHAOS’ authors thought the move was due to a

Problem solving

Problem solving is the process of achieving a goal by overcoming obstacles, a frequent part of most activities. Problems in need of solutions range from simple personal tasks (e.g. how to get from point A to B) to complex issues in business and technical fields. The former is an example of simple problem solving (SPS) addressing one issue, whereas the latter is complex problem solving (CPS) with multiple interrelated obstacles. Another classification of problem-solving tasks is into well-defined problems with specific obstacles and goals, and ill-defined problems in which the current situation is troublesome but it is not clear what kind of resolution to aim for. Similarly, one may distinguish formal or fact-based problems requiring psychometric intelligence, versus socio-emotional problems which depend on the changeable emotions of individuals or groups, such as tactful behavior, fashion, or gift choices. Solutions require sufficient resources and knowledge to attain the goal. Professionals such as lawyers, doctors, programmers, and consultants are largely problem solvers for issues that require technical skills and knowledge beyond general competence. Many businesses have found profitable markets by recognizing a problem and creating a solution: the more widespread and inconvenient the problem, the greater the opportunity to develop a scalable solution. There are many specialized problem-solving techniques and methods in fields such as science, engineering, business, medicine, mathematics, computer science, philosophy, and social organization. The mental techniques to identify, analyze, and solve problems are studied in psychology and cognitive sciences. Also widely researched are the mental obstacles that prevent people from finding solutions; problem-solving impediments include confirmation bias, mental set, and functional fixedness. == Definition == The term problem solving has a slightly different meaning depending on the discipline. For instance, it is a mental process in psychology and a computerized process in computer science. There are two different types of problems: ill-defined and well-defined; different approaches are used for each. Well-defined problems have specific end goals and clearly expected solutions, while ill-defined problems do not. Well-defined problems allow for more initial planning than ill-defined problems. Solving problems sometimes involves dealing with pragmatics (the way that context contributes to meaning) and semantics (the interpretation of the problem). The ability to understand what the end goal of the problem is, and what rules could be applied, represents the key to solving the problem. Sometimes a problem requires abstract thinking or coming up with a creative solution. Problem solving has two major domains: mathematical problem solving and personal problem solving. Each concerns some difficulty or barrier that is encountered. === Psychology === Problem solving in psychology refers to the process of finding solutions to problems encountered in life. Solutions to these problems are usually situation- or context-specific. The process starts with problem finding and problem shaping, in which the problem is discovered and simplified. The next step is to generate possible solutions and evaluate them. Finally a solution is selected to be implemented and verified. Problems have an end goal to be reached; how you get there depends upon problem orientation (problem-solving coping style and skills) and systematic analysis. Mental health professionals study the human problem-solving processes using methods such as introspection, behaviorism, simulation, computer modeling, and experiment. Social psychologists look into the person-environment relationship aspect of the problem and independent and interdependent problem-solving methods. Problem solving has been defined as a higher-order cognitive process and intellectual function that requires the modulation and control of more routine or fundamental skills. Empirical research shows many different strategies and factors influence everyday problem solving. Rehabilitation psychologists studying people with frontal lobe injuries have found that deficits in emotional control and reasoning can be re-mediated with effective rehabilitation and could improve the capacity of injured persons to resolve everyday problems. Interpersonal everyday problem solving is dependent upon personal motivational and contextual components. One such component is the emotional valence of "real-world" problems, which can either impede or aid problem-solving performance. Researchers have focused on the role of emotions in problem solving, demonstrating that poor emotional control can disrupt focus on the target task, impede problem resolution, and lead to negative outcomes such as fatigue, depression, and inertia. In conceptualization,human problem solving consists of two related processes: problem orientation, and the motivational/attitudinal/affective approach to problematic situations and problem-solving skills. People's strategies cohere with their goals and stem from the process of comparing oneself with others. === Cognitive sciences === Among the first experimental psychologists to study problem solving were the Gestaltists in Germany, such as Karl Duncker in The Psychology of Productive Thinking (1935). Perhaps best known is the work of Allen Newell and Herbert A. Simon. Experiments in the 1960s and early 1970s asked participants to solve relatively simple, well-defined, but not previously seen laboratory tasks. These simple problems, such as the Tower of Hanoi, admitted optimal solutions that could be found quickly, allowing researchers to observe the full problem-solving process. Researchers assumed that these model problems would elicit the characteristic cognitive processes by which more complex "real world" problems are solved. An outstanding problem-solving technique found by this research is the principle of decomposition. === Computer science === Much of computer science and artificial intelligence involves designing automated systems to solve a specified type of problem: to accept input data and calculate a correct or adequate response, reasonably quickly. Algorithms are recipes or instructions that direct such systems, written into computer programs. Steps for designing such systems include problem determination, heuristics, root cause analysis, de-duplication, analysis, diagnosis, and repair. Analytic techniques include linear and nonlinear programming, queuing systems, and simulation. A large, perennial obstacle is to find and fix errors in computer programs: debugging. === Logic === Formal logic concerns issues like validity, truth, inference, argumentation, and proof. In a problem-solving context, it can be used to formally represent a problem as a theorem to be proved, and to represent the knowledge needed to solve the problem as the premises to be used in a proof that the problem has a solution. The use of computers to prove mathematical theorems using formal logic emerged as the field of automated theorem proving in the 1950s. It included the use of heuristic methods designed to simulate human problem solving, as in the Logic Theory Machine, developed by Allen Newell, Herbert A. Simon and J. C. Shaw, as well as algorithmic methods such as the resolution principle developed by John Alan Robinson. In addition to its use for finding proofs of mathematical theorems, automated theorem-proving has also been used for program verification in computer science. In 1958, John McCarthy proposed the advice taker, to represent information in formal logic and to derive answers to questions using automated theorem-proving. An important step in this direction was made by Cordell Green in 1969, who used a resolution theorem prover for question-answering and for such other applications in artificial intelligence as robot planning. The resolution theorem-prover used by Cordell Green bore little resemblance to human problem solving methods. In response to criticism of that approach from researchers at MIT, Robert Kowalski developed logic programming and SLD resolution, which solves problems by problem decomposition. He has advocated logic for both computer and human problem solving and computational logic to improve human thinking. === Engineering === When products or processes fail, problem solving techniques can be used to develop corrective actions that can be taken to prevent further failures. Such techniques can also be applied to a product or process prior to an actual failure event—to predict, analyze, and mitigate a potential problem in advance. Techniques such as failure mode and effects analysis can proactively reduce the likelihood of problems. In either the reactive or the proactive case, it is necessary to build a causal explanation through a process of diagnosis. In deriving an explanation of effects in terms of causes, abduction generates new ideas or hypothes

Kolmogorov–Arnold Networks

Kolmogorov–Arnold Networks (KANs) are a type of artificial neural network architecture inspired by the Kolmogorov–Arnold representation theorem, also known as the superposition theorem. Unlike traditional multilayer perceptrons (MLPs), which rely on fixed activation functions and linear weights, KANs replace each weight with a learnable univariate function, often represented using splines. == History == KANs (Kolmogorov–Arnold Networks) were proposed by Liu et al. (2024) as a generalization of the Kolmogorov–Arnold representation theorem (KART), aiming to outperform MLPs in small-scale AI and scientific tasks. Before KANs, numerous studies explored KART's connections to neural networks or used it as a basis for designing new network architectures. In the 1980s and 1990s, early research applied KART to neural network design. Kůrková et al. (1992), Hecht-Nielsen (1987), and Nees (1994) established theoretical foundations for multilayer networks based on KART. Igelnik et al. (2003) introduced the Kolmogorov Spline Network using cubic splines to model complex functions. Sprecher (1996, 1997) introduced numerical methods for building network layers, while Nakamura et al. (1993) created activation functions with guaranteed approximation accuracy. These works linked KART's theoretical potential with practical neural network implementation. KART has also been used in other computational and theoretical fields. Coppejans (2004) developed nonparametric regression estimators using B-splines, Bryant (2008) applied it to high-dimensional image tasks, Liu (2015) investigated theoretical applications in optimal transport and image encryption, and more recently, Polar and Poluektov (2021) used Urysohn operators for efficient KART construction, while Fakhoury et al. (2022) introduced ExSpliNet, integrating KART with probabilistic trees and multivariate B-splines for improved function approximation. == Architecture == KANs are based on the Kolmogorov–Arnold representation theorem, which was linked to the 13th Hilbert problem. Given x = ( x 1 , x 2 , … , x n ) {\displaystyle x=(x_{1},x_{2},\dots ,x_{n})} consisting of n variables, a multivariate continuous function f ( x ) {\displaystyle f(x)} can be represented as: f ( x ) = f ( x 1 , … , x n ) = ∑ q = 1 2 n + 1 Φ q ( ∑ p = 1 n φ q , p ( x p ) ) {\displaystyle f(x)=f(x_{1},\dots ,x_{n})=\sum _{q=1}^{2n+1}\Phi _{q}\left(\sum _{p=1}^{n}\varphi _{q,p}(x_{p})\right)} (1) This formulation contains two nested summations: an outer and an inner sum. The outer sum ∑ q = 1 2 n + 1 {\displaystyle \sum _{q=1}^{2n+1}} aggregates 2 n + 1 {\displaystyle 2n+1} terms, each involving a function Φ q : R → R {\displaystyle \Phi _{q}:\mathbb {R} \to \mathbb {R} } . The inner sum ∑ p = 1 n {\displaystyle \sum _{p=1}^{n}} computes n terms for each q, where each term φ q , p : [ 0 , 1 ] → R {\displaystyle \varphi _{q,p}:[0,1]\to \mathbb {R} } is a continuous function of the single variable x p {\displaystyle x_{p}} . The inner continuous functions φ q , p {\displaystyle \varphi _{q,p}} are universal, independent of f {\displaystyle f} , while the outer functions Φ q {\displaystyle \Phi _{q}} depend on the specific function f {\displaystyle f} being represented. The representation (1) holds for all multivariate functions f {\displaystyle f} as proved in . If f {\displaystyle f} is continuous, then the outer functions Φ q {\displaystyle \Phi _{q}} are continuous; if f {\displaystyle f} is discontinuous, then the corresponding Φ q {\displaystyle \Phi _{q}} are generally discontinuous, while the inner functions φ q , p {\displaystyle \varphi _{q,p}} remain the same universal functions. Liu et al. proposed the name KAN. A general KAN network consisting of L layers takes x to generate the output as: K A N ( x ) = ( Φ L − 1 ∘ Φ L − 2 ∘ ⋯ ∘ Φ 1 ∘ Φ 0 ) x {\displaystyle \mathrm {KAN} (x)=(\Phi ^{L-1}\circ \Phi ^{L-2}\circ \cdots \circ \Phi ^{1}\circ \Phi ^{0})x} (3) Here, Φ l {\displaystyle \Phi ^{l}} is the function matrix of the l-th KAN layer or a set of pre-activations. Let i denote the neuron of the l-th layer and j the neuron of the (l+1)-th layer. The activation function φ j , i l {\displaystyle \varphi _{j,i}^{l}} connects (l, i) to (l+1, j): φ j , i l , l = 0 , … , L − 1 , i = 1 , … , n l , j = 1 , … , n l + 1 {\displaystyle \varphi _{j,i}^{l},\quad l=0,\dots ,L-1,\;i=1,\dots ,n_{l},\;j=1,\dots ,n_{l+1}} (4) where nl is the number of nodes of the l-th layer. Thus, the function matrix Φ l {\displaystyle \Phi ^{l}} can be represented as an n l + 1 × n l {\displaystyle n_{l+1}\times n_{l}} matrix of activations: x l + 1 = ( φ 1 , 1 l ( ⋅ ) φ 1 , 2 l ( ⋅ ) ⋯ φ 1 , n l l ( ⋅ ) φ 2 , 1 l ( ⋅ ) φ 2 , 2 l ( ⋅ ) ⋯ φ 2 , n l l ( ⋅ ) ⋮ ⋮ ⋱ ⋮ φ n l + 1 , 1 l ( ⋅ ) φ n l + 1 , 2 l ( ⋅ ) ⋯ φ n l + 1 , n l l ( ⋅ ) ) x l {\displaystyle x^{l+1}={\begin{pmatrix}\varphi _{1,1}^{l}(\cdot )&\varphi _{1,2}^{l}(\cdot )&\cdots &\varphi _{1,n_{l}}^{l}(\cdot )\\\varphi _{2,1}^{l}(\cdot )&\varphi _{2,2}^{l}(\cdot )&\cdots &\varphi _{2,n_{l}}^{l}(\cdot )\\\vdots &\vdots &\ddots &\vdots \\\varphi _{n_{l+1},1}^{l}(\cdot )&\varphi _{n_{l+1},2}^{l}(\cdot )&\cdots &\varphi _{n_{l+1},n_{l}}^{l}(\cdot )\end{pmatrix}}x^{l}} == Implementations == To make the KAN layers optimizable, the inner function is formed by the combination of spline and basic functions as the formula: φ ( x ) = w b b ( x ) + w s spline ( x ) {\displaystyle \varphi (x)=w_{b}\,b(x)+w_{s}\,{\text{spline}}(x)} where b ( x ) {\displaystyle b(x)} is the basic function, usually defined as s i l u ( x ) = x / ( 1 + e x ) {\displaystyle silu(x)=x/(1+e^{x})} and w b {\displaystyle w_{b}} is the base weight matrix. Also, w s {\displaystyle w_{s}} is the spline weight matrix and spline ( x ) {\displaystyle {\text{spline}}(x)} is the spline function. The spline function can be a sum of B-splines. spline ( x ) = ∑ i c i B i ( x ) {\displaystyle {\text{spline}}(x)=\sum _{i}c_{i}B_{i}(x)} Many studies suggested to use other polynomial and curve functions instead of B-spline to create new KAN variants. == Functions used == The choice of functional basis strongly influences the performance of KANs. Common function families include: B-splines: Provide locality, smoothness, and interpretability; they are the most widely used in current implementations. RBFs (include Gaussian RBFs): Capture localized features in data and are effective in approximating functions with non-linear or clustered structures. Chebyshev polynomials: Offer efficient approximation with minimized error in the maximum norm, making them useful for stable function representation. Rational function: Useful for approximating functions with singularities or sharp variations, as they can model asymptotic behavior better than polynomials. Fourier series: Capture periodic patterns effectively and are particularly useful in domains such as physics-informed machine learning. Wavelet functions (DoG, Mexican hat, Morlet, and Shannon): Used for feature extraction as they can capture both high-frequency and low-frequency data components. Piecewise linear functions: Provide efficient approximation for multivariate functions in KANs. == Usage == In some modern neural architectures like convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers, KANs are typically used as drop-in substitutes for MLP layers. Despite KANs' general-purpose design, researchers have created and used them for a number of tasks: Scientific machine learning (SciML): Function fitting, partial differential equations (PDEs) and physical/mathematical laws. Continual learning: KANs better preserve previously learned information during incremental updates, avoiding catastrophic forgetting due to the locality of spline adjustments. Graph neural networks: Extensions such as Kolmogorov–Arnold Graph Neural Networks (KA-GNNs) integrate KAN modules into message-passing architectures, showing improvements in molecular property prediction tasks. Sensor data processing: Kolmogorov–Arnold Networks (KANs) have recently been applied to sensor data processing due to their ability to model complex nonlinear relationships with relatively few parameters and improved interpretability compared to conventional multilayer perceptrons. Applications include industrial soft sensors, biomedical signal analysis, remote sensing, and environmental monitoring systems. == Drawbacks == KANs can be computationally intensive and require a large number of parameters due to their use of polynomial functions to capture data.

Dropbox Carousel

Dropbox Carousel was a photo and video management app offered by Dropbox. The third-party native app, available on Android and iOS, allowed users to store, manage, and organize photos. Photos were organized by date, time and event and backed up on Dropbox. It competed in this space against other online photo storage services such as Google's Google Photos, Apple's iCloud, and Yahoo's Flickr. Chris Lee, Dropbox's head of product development for Carousel described the app as an add-on to Dropbox, a “dedicated experience for photos and videos” and a space for “reliving personal memories”. == History == Mailbox founder, Gentry Underwood unveiled Carousel at a gathering in San Francisco on April 9, 2014. Much of the features in Carousel come from Snapjoy, a photo start-up, that Dropbox acquired on December 19, 2012. When Carousel was launched, it marked amongst many others, a series of acquisitions made by Dropbox to prep up before opening its stock for public offering. The acquisitions would help demonstrate its expansive product offerings pitching potential profitability to investors. In December 2015, Dropbox announced that Carousel would be shut down and some Carousel features would be integrated into the primary Dropbox application. On March 31, 2016, Carousel was deactivated. == Features == Carousel prompted users to free local storage once it had synced and backed-up local photos to the cloud. Flashback was a feature (enabled by default) that showed past photos or videos taken the same day, a year, or some years back. Flashback used an algorithm designed to identify human faces - resulting in greater likelihood of the user's picture or people in the user's close circle appearing. A scrollable timeline, which was earlier a scroll wheel, at the bottom let the user scroll to photo(s) at a specific date with a finger swipe.

Semantic folding

Semantic folding theory describes a procedure for encoding the semantics of natural language text in a semantically grounded binary representation. This approach provides a framework for modelling how language data is processed by the neocortex. == Theory == Semantic folding theory draws inspiration from Douglas R. Hofstadter's Analogy as the Core of Cognition which suggests that the brain makes sense of the world by identifying and applying analogies. The theory hypothesises that semantic data must therefore be introduced to the neocortex in such a form as to allow the application of a similarity measure and offers, as a solution, the sparse binary vector employing a two-dimensional topographic semantic space as a distributional reference frame. The theory builds on the computational theory of the human cortex known as hierarchical temporal memory (HTM), and positions itself as a complementary theory for the representation of language semantics. A particular strength claimed by this approach is that the resulting binary representation enables complex semantic operations to be performed simply and efficiently at the most basic computational level. == Two-dimensional semantic space == Analogous to the structure of the neocortex, Semantic Folding theory posits the implementation of a semantic space as a two-dimensional grid. This grid is populated by context-vectors in such a way as to place similar context-vectors closer to each other, for instance, by using competitive learning principles. This vector space model is presented in the theory as an equivalence to the well known word space model described in the information retrieval literature. Given a semantic space (implemented as described above) a word-vector can be obtained for any given word Y by employing the following algorithm: For each position X in the semantic map (where X represents cartesian coordinates) if the word Y is contained in the context-vector at position X then add 1 to the corresponding position in the word-vector for Y else add 0 to the corresponding position in the word-vector for Y The result of this process will be a word-vector containing all the contexts in which the word Y appears and will therefore be representative of the semantics of that word in the semantic space. It can be seen that the resulting word-vector is also in a sparse distributed representation (SDR) format [Schütze, 1993] & [Sahlgreen, 2006]. Some properties of word-SDRs that are of particular interest with respect to computational semantics are: high noise resistance: As a result of similar contexts being placed closer together in the underlying map, word-SDRs are highly tolerant of false or shifted "bits". boolean logic: It is possible to manipulate word-SDRs in a meaningful way using boolean (OR, AND, exclusive-OR) and/or arithmetical (SUBtract) functions . sub-sampling: Word-SDRs can be sub-sampled to a high degree without any appreciable loss of semantic information. topological two-dimensional representation: The SDR representation maintains the topological distribution of the underlying map therefore words with similar meanings will have similar word-vectors. This suggests that a variety of measures can be applied to the calculation of semantic similarity, from a simple overlap of vector elements, to a range of distance measures such as: Euclidean distance, Hamming distance, Jaccard distance, cosine similarity, Levenshtein distance, Sørensen-Dice index, etc. == Semantic spaces == Semantic spaces in the natural language domain aim to create representations of natural language that are capable of capturing meaning. The original motivation for semantic spaces stems from two core challenges of natural language: Vocabulary mismatch (the fact that the same meaning can be expressed in many ways) and ambiguity of natural language (the fact that the same term can have several meanings). The application of semantic spaces in natural language processing (NLP) aims at overcoming limitations of rule-based or model-based approaches operating on the keyword level. The main drawback with these approaches is their brittleness, and the large manual effort required to create either rule-based NLP systems or training corpora for model learning. Rule-based and machine learning-based models are fixed on the keyword level and break down if the vocabulary differs from that defined in the rules or from the training material used for the statistical models. Research in semantic spaces dates back more than 20 years. In 1996, two papers were published that raised a lot of attention around the general idea of creating semantic spaces: latent semantic analysis from Microsoft and Hyperspace Analogue to Language from the University of California. However, their adoption was limited by the large computational effort required to construct and use those semantic spaces. A breakthrough with regard to the accuracy of modelling associative relations between words (e.g. "spider-web", "lighter-cigarette", as opposed to synonymous relations such as "whale-dolphin", "astronaut-driver") was achieved by explicit semantic analysis (ESA) in 2007. ESA was a novel (non-machine learning) based approach that represented words in the form of vectors with 100,000 dimensions (where each dimension represents an Article in Wikipedia). However practical applications of the approach are limited due to the large number of required dimensions in the vectors. More recently, advances in neural networking techniques in combination with other new approaches (tensors) led to a host of new recent developments: Word2vec from Google and GloVe from Stanford University. Semantic folding represents a novel, biologically inspired approach to semantic spaces where each word is represented as a sparse binary vector with 16,000 dimensions (a semantic fingerprint) in a 2D semantic map (the semantic universe). Sparse binary representation are advantageous in terms of computational efficiency, and allow for the storage of very large numbers of possible patterns. == Visualization == The topological distribution over a two-dimensional grid (outlined above) lends itself to a bitmap type visualization of the semantics of any word or text, where each active semantic feature can be displayed as e.g. a pixel. As can be seen in the images shown here, this representation allows for a direct visual comparison of the semantics of two (or more) linguistic items. Image 1 clearly demonstrates that the two disparate terms "dog" and "car" have, as expected, very obviously different semantics. Image 2 shows that only one of the meaning contexts of "jaguar", that of "Jaguar" the car, overlaps with the meaning of Porsche (indicating partial similarity). Other meaning contexts of "jaguar" e.g. "jaguar" the animal clearly have different non-overlapping contexts. The visualization of semantic similarity using Semantic Folding bears a strong resemblance to the fMRI images produced in a research study conducted by A.G. Huth et al., where it is claimed that words are grouped in the brain by meaning. voxels, little volume segments of the brain, were found to follow a pattern were semantic information is represented along the boundary of the visual cortex with visual and linguistic categories represented on posterior and anterior side respectively.