Research
Optimization
Plants use electric fields to communicate with bees. As they zero in on their sugary reward, foraging bumblebees follow an invisible clue: electric fields. Bumblebees are able to detect and decipher weak electric signals emitted by flowers. The interaction between plants and pollinators (or transport and riders) is a signalling game, because some flowers trick their pollinators into pollinating without offering any food. Cheater flowers of the same species can be of different colors, and this might confuse the pollinators, because bees have to visit many flowers before learning that all of the colors represent cheater flowers.
The transportation problem in operations research aims to minimize costs by optimizing the allocation of a variety of features from multiple sources to multiple destinations, subject to supply, demand, queue length or density, time, fuel costs, alternative transportation constraints, and so on. This multi-dimensional optimization problem determines the ideal allocation of capacity and assignment of resources. To provide decision-makers with a comprehensive set of options that reduce costs or minimize transportation time while maximizing reward, a combinatorial game-theoretic approach is employed. The co-evolutionary relationship turns out to be rich and complex. It involves cooperation, like that between plants and their animal vectors, to the benefit of each, but also competition and adaptive compromise.
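For reference, the classical single-objective form of the transportation problem can be written as a linear program. This is the textbook formulation, not necessarily the multi-dimensional variant described above; the symbols are assumptions introduced here: $c_{ij}$ is the unit cost of shipping from source $i$ to destination $j$, $x_{ij}$ the amount shipped, $s_i$ the supply at source $i$, and $d_j$ the demand at destination $j$.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}\,x_{ij} \\
\text{s.t.}\quad & \sum_{j=1}^{n} x_{ij} \le s_i, \qquad i = 1,\dots,m, \\
& \sum_{i=1}^{m} x_{ij} \ge d_j, \qquad j = 1,\dots,n, \\
& x_{ij} \ge 0 .
\end{aligned}
```

Additional considerations such as time, congestion, or fuel would enter either as extra terms in the objective or as further constraints.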
Reward-Objective Inference
We optimize action-response functions in signalling-game situations, given latent features and patterns in their use, such as colours, shapes, patterns, fragrant volatiles, and, in some cases, temperature and tactile cues, in order to achieve reward and save hours of training and learning by optimizing the set of choice-sets.
To investigate this, we compiled two sets of data: (a) thousands of hours of video and audio of flower petals changing colors; and (b) thousands of hours of video and audio of transportation, used to set up and simulate how disordered nanostructures enable machines to produce salient visual signals. These disordered (nano)structures (identified in most major modes of transportation) have distinct anatomies but convergent optical properties; they all produce breakdowns in operation, predominantly while optimizing reward conditions. The machine learned which colors represent flowers with no food, and it preferred to take a risk by visiting a transportation mode with new conditions rather than the ones it already recognized as cheaters.
Estimating the probability density function over a set of random variables underlies the majority of learning frameworks and depends heavily on the partition and response functions. The partition function is the normalizer of a density function and ensures that it integrates to 1. This function needs to be minimized when learning the proper data distribution. Optimizing the partition and response functions, however, is a hard and often intractable problem.
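To make the role of the normalizer concrete, here is a minimal sketch that computes a partition function by brute-force enumeration for a tiny binary energy-based model. The pairwise energy form, the names `energy`, `partition_function`, and `probability`, and the parameter values are all illustrative assumptions, not taken from the work described here.

```python
import itertools
import math


def energy(state, weights, biases):
    """Energy of a binary state under an assumed pairwise model."""
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))
    unary = sum(b * s for b, s in zip(biases, state))
    return -(pair + unary)


def partition_function(weights, biases):
    """Sum of exp(-energy) over all 2**n binary states (exponential cost)."""
    n = len(biases)
    return sum(math.exp(-energy(s, weights, biases))
               for s in itertools.product((0, 1), repeat=n))


def probability(state, weights, biases, z=None):
    """Normalized probability of one state; z can be passed in to avoid recomputation."""
    if z is None:
        z = partition_function(weights, biases)
    return math.exp(-energy(state, weights, biases)) / z


# Hypothetical parameters for a 3-variable model (upper-triangular couplings).
WEIGHTS = [[0.0, 0.5, -0.3],
           [0.0, 0.0, 0.8],
           [0.0, 0.0, 0.0]]
BIASES = [0.1, -0.2, 0.0]

Z = partition_function(WEIGHTS, BIASES)
TOTAL = sum(probability(s, WEIGHTS, BIASES, Z)
            for s in itertools.product((0, 1), repeat=3))
```

Dividing by `Z` is what makes the probabilities sum to 1. The enumeration visits every one of the 2**n states, which illustrates why exact computation becomes intractable as the number of variables grows.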
A variety of Markov chain Monte Carlo methods exist for approximately maximizing the likelihood of models with partition functions, such as (i) contrastive divergence, a stochastic gradient descent procedure for computing the model parameter update, and (ii) fast persistent contrastive divergence, which relies on re-parametrizing the model and introducing parameters that are trained with a much larger learning rate so that the Markov chain is forced to mix rapidly. Other techniques include predictive controls, iterative scaling schemes, the Expectation-Maximization algorithm, non-negative matrix factorization, the convex-concave procedure, minimization by incremental surrogate optimization, and a technique based on constructing a quadratic bound on the partition function.
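As an illustration of the first of these methods, the following is a minimal sketch of one-step contrastive divergence (CD-1) for a small Bernoulli restricted Boltzmann machine, written with the standard library only. The model choice, function names, toy data, and hyperparameters are assumptions made for the example; it is not the configuration used in this work, and it does not cover the fast persistent variant.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return hidden activation probabilities and a binary sample given visibles."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return visible activation probabilities and a binary sample given hiddens."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """One in-place CD-1 step: data statistics minus one-step Gibbs statistics."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step back to visibles
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


def train(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train a tiny RBM with CD-1; returns weights, visible biases, hidden biases."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
         for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    for _ in range(epochs):
        for v in data:
            cd1_update(v, W, b, c, lr, rng)
    return W, b, c


# Hypothetical toy data: two complementary binary patterns.
DATA = [[1, 1, 0, 0], [0, 0, 1, 1]]
W, b, c = train(DATA, n_hidden=2)
```

The update never evaluates the partition function: the intractable model expectation is replaced by statistics from a single Gibbs step started at the data, which is the approximation that makes the method practical.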
We find three general conditions that are required for the evolutionary stability of signalling: (a) a high frequency of high-yield signalling flowers in the population, (b) a balance between the cost and benefit of signalling transportation congestion or density, and (c) a high cost of dishonest signalling. This type of pollinator-plant interaction, or transport-rider interaction, with information asymmetry can be understood as a problem of “partner choice”. Signalling could provide a mechanism to solve this problem. The plants (or transport) could signal their reward quality or quantity, and the pollinators (or riders) could use these signals to decide whether or not to visit the flower (or use the road).
Knowing multiple good alternatives helps decision-makers switch solutions easily when needed, for example when faced with unforeseen constraints. A partition-based random search with predictive controls for a multimodal optimization task aims to find multiple global optima as well as high-quality local optima of an optimization problem.
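The idea of returning several optima rather than one can be sketched with a single-level, one-dimensional simplification of partition-based random search: split the domain into cells, sample each cell at random, and report every cell whose best sample beats its neighbours. The function names, the test function, and all parameter values are assumptions for illustration; the predictive-control component is omitted, and a fuller method would recursively re-partition the promising cells.

```python
import random


def partition_random_search(f, lo, hi, n_cells=10, samples_per_cell=50, seed=0):
    """Return (x, f(x)) for each cell whose best sample beats both neighbouring cells."""
    rng = random.Random(seed)
    width = (hi - lo) / n_cells
    best = []  # best (value, x) pair found in each cell
    for k in range(n_cells):
        start = lo + k * width
        xs = [start + rng.random() * width for _ in range(samples_per_cell)]
        best.append(max((f(x), x) for x in xs))
    peaks = []
    for k, (value, x) in enumerate(best):
        left = best[k - 1][0] if k > 0 else float("-inf")
        right = best[k + 1][0] if k < n_cells - 1 else float("-inf")
        if value > left and value > right:
            peaks.append((x, value))
    return peaks


def two_peak(x):
    """Hypothetical objective with two equal global maxima, at x = -1 and x = +1."""
    return -(x * x - 1.0) ** 2


PEAKS = partition_random_search(two_peak, -2.0, 2.0)
```

On this objective the search reports one candidate near each of the two maxima, so a decision-maker who loses access to one solution still has the other available.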
In this signalling game, the central question is how honest signals become established when the interests of signallers and receivers partly conflict. On the one hand, because floral signals and rewards are uncoupled within one flower, the machine could reduce its cost by sending dishonest signals (signals with low trade-offs and utility relative to the amount of reward). According to costly signalling theory, the cost of signals is an essential element of stable honest signalling in any signalling scenario, and it thus extends to signaller-receiver interactions. If the benefits for low-quality and high-quality signallers are the same, the effectiveness of signalling depends on the strength of the trade-offs and utility between the cost and the quality desired by the receiver: the cost must be higher for the low-quality sender. When the signal has a cost, only high-quality signallers will find it profitable to advertise their quality, and therefore the signal will be honest. On the other hand, receivers usually rely on signals to assess the optimality of rewards.
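The cost condition described above can be checked with a small worked example. This is a minimal sketch under simplifying assumptions that are not in the text: both sender types receive the same benefit when the receiver responds to the signal, a sender that does not signal gets a payoff of zero, and the function names and numbers are hypothetical.

```python
def signalling_payoff(benefit, cost):
    """Net payoff of sending the signal (assumed payoff of not signalling is 0)."""
    return benefit - cost


def is_honest_equilibrium(benefit, cost_high_quality, cost_low_quality):
    """Signal is honest if it pays for high-quality senders but not for low-quality ones."""
    high_signals = signalling_payoff(benefit, cost_high_quality) > 0
    low_signals = signalling_payoff(benefit, cost_low_quality) > 0
    return high_signals and not low_signals


# Hypothetical numbers: equal benefit of 1.0 for both sender types.
HONEST = is_honest_equilibrium(1.0, cost_high_quality=0.4, cost_low_quality=1.5)
CHEATING_PAYS = is_honest_equilibrium(1.0, cost_high_quality=0.4, cost_low_quality=0.6)
NOBODY_SIGNALS = is_honest_equilibrium(1.0, cost_high_quality=1.2, cost_low_quality=1.5)
```

Only the first case separates the two types: the signal is cheap enough for the high-quality sender to profit and too expensive for the low-quality sender to imitate. In the second case both types profit from signalling, so the signal carries no information; in the third case neither does.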
The foraging behaviour of receivers could lead signallers to offer honest signals through repeated interactions with plants. Receivers, such as bumblebees (or the machine in this case), not only learn positive or adverse associations between signals and rewards, but also gather information about reward amounts and use it to improve their subsequent foraging efficiency in repeated interactions with plants (or signallers). The machine could remember the most profitable rewards and return preferentially to these. Over time, an experienced learner could discriminate between rewarding and less-rewarding signals and return more frequently to signals providing high amounts of reward.
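This kind of learning from repeated visits can be sketched as an epsilon-greedy learner that keeps a running reward estimate per signal. The technique is swapped in here for illustration and is not stated to be the method used in this work; the signal names, reward values (deterministic for simplicity), and parameters are hypothetical.

```python
import random


def forage(reward_by_signal, n_visits=500, epsilon=0.1, learning_rate=0.2, seed=0):
    """Epsilon-greedy foraging: mostly revisit the best-known signal, sometimes explore."""
    rng = random.Random(seed)
    signals = list(reward_by_signal)
    estimate = {s: 0.0 for s in signals}  # learned reward estimate per signal
    visits = {s: 0 for s in signals}
    for _ in range(n_visits):
        if rng.random() < epsilon:
            choice = rng.choice(signals)  # explore a random signal
        else:
            choice = max(signals, key=lambda s: estimate[s])  # exploit the best estimate
        reward = reward_by_signal[choice]
        # Move the estimate a fraction of the way toward the observed reward.
        estimate[choice] += learning_rate * (reward - estimate[choice])
        visits[choice] += 1
    return estimate, visits


# Hypothetical signals: "white" is a cheater that offers no reward.
REWARDS = {"blue": 1.0, "yellow": 0.2, "white": 0.0}
ESTIMATE, VISITS = forage(REWARDS)
```

After enough visits the learner's estimates separate the rewarding signal from the others, and most visits go to the most profitable one, while the small exploration rate keeps it sampling signals whose conditions may have changed.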