Abstract of: US20260063424A1
Techniques for localizing a vehicle in real time using dynamic uncertainty estimates are presented. The techniques include obtaining a terrain image captured by the vehicle; passing the terrain image to a trained evidential deep learning neural network subsystem, from which a dynamic uncertainty value and a first feature vector are obtained in real time; for each of a plurality of candidate terrain locations, comparing the first feature vector to a respective second feature vector representative of a candidate terrain location, from which a respective similarity score is obtained; for at least one of the plurality of candidate terrain locations, updating in real time, by a recursive Bayesian estimator, a respective location weight based on the dynamic uncertainty value and the respective similarity score; estimating, in real time, a location of the vehicle based on the plurality of location weights; and providing the location of the vehicle.
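The recursive Bayesian weight update described above can be sketched in a few lines. This is a hedged illustration, not the patent's implementation: the likelihood model (an exponential of the similarity score scaled by the dynamic uncertainty) and all function names are assumptions introduced for clarity.

```python
# Hypothetical sketch of the recursive Bayesian location-weight update:
# weights over candidate terrain locations are updated from per-candidate
# similarity scores and a dynamic uncertainty value, then normalized.
import math

def update_location_weights(weights, similarities, uncertainty):
    """One recursive Bayesian step: prior weight x likelihood, then normalize.

    A higher dynamic uncertainty flattens the likelihood, so a single noisy
    similarity score moves the posterior less (an assumed modelling choice).
    """
    likelihoods = [math.exp(s / max(uncertainty, 1e-9)) for s in similarities]
    posterior = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(posterior)
    return [p / total for p in posterior]

def estimate_location(candidates, weights):
    """Point estimate: the candidate with the highest posterior weight."""
    return max(zip(candidates, weights), key=lambda cw: cw[1])[0]
```

Starting from a uniform prior, the candidate whose feature vector is most similar to the current terrain image accumulates weight with each update.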
Abstract of: WO2026050564A1
A method for generating a classifier, comprising: initializing a neural network with a fully connected architecture; applying a regularization constraint to the set of weights between neurons in the input layer and the plurality of latent features in the hidden layer; iteratively reducing the number of incoming connections to each latent feature in the hidden layer to a predetermined number based on the regularized set of weights; weighting the loss function's value based on categories of activation tuples, wherein contributions of data entries with activation tuples of size greater than a predetermined threshold to the loss function's value are limited, and wherein contributions of data entries with activation tuples of size "0" to the loss function's value are minimized to a predefined percentage; and selectively updating the sets of weights associated with top-ranked latent features, as evaluated by the magnitude of their contributions at the output layer, in a training process.
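The connection-reduction step can be illustrated with a small sketch. The keep rule shown (top-k incoming weights by absolute value per latent feature) is one plausible reading of "reducing the number of incoming connections to a predetermined number", not the patent's definitive method; the function name is invented.

```python
# Rough sketch of per-latent-feature connection pruning: after
# regularization, each hidden latent feature keeps only its k
# largest-magnitude incoming weights; the rest are zeroed out.
def prune_incoming(weights, k):
    """weights[j][i] is the weight from input neuron i into latent feature j.

    Returns a copy where each latent feature retains its k largest-magnitude
    incoming weights and has all other incoming weights set to 0.0.
    """
    pruned = []
    for row in weights:
        # Rank input indices by weight magnitude, keep the top k.
        order = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)
        keep = set(order[:k])
        pruned.append([w if i in keep else 0.0 for i, w in enumerate(row)])
    return pruned
```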
Abstract of: WO2026046614A1
The invention relates to a system for a vehicle (0), comprising a computing module (1) having a neural network and an input data interface (3) designed to provide past movements of other road users and current sensor data, wherein a first (5) and a second (7) sub-network make a location-related or a time-related prediction of the future behavior of other road users, and the input data interface (3) carries out reinforcement learning with the aid of an anomaly detection unit (9) for adapting weightings of a machine learning attention mechanism.
Abstract of: US20260065662A1
An electronic device includes: a processor; and a memory storing instructions. By executing the instructions, the processor is configured to: receive a first image, recognize a plurality of objects in the first image to generate object information representing the plurality of objects, generate an object relationship graph including relationships between the plurality of objects, based on the first image and the object information, obtain image effect data including image effects to be respectively applied to the plurality of objects by inputting the object relationship graph to an image modification Graph Neural Network (GNN) model, and generate a modified image based on the first image, the object information, and the image effect data.
Abstract of: US20260064726A1
A method and system for providing an intelligent response agent based on a sophisticated reasoning and speculation function. The method and system can generate and provide response data for queries related to specialized documents using a deep-learning neural network that implements the sophisticated reasoning and speculation function as a stepwise process.
Abstract of: WO2026049296A1
An electronic device is disclosed. The present electronic device comprises: a neural network model including an acoustic encoder and a text encoder trained on relationships between a plurality of first sample sounds and a plurality of sample texts; a memory for storing instructions; and at least one processor including processing circuitry, wherein the instructions, when executed individually or collectively by the at least one processor, may: input target text corresponding to a target keyword into the text encoder to obtain a text embedding; and retrain the neural network model on the basis of a plurality of second sample sounds including the target keyword and the text embedding to obtain a final neural network model in which the acoustic encoder is updated.
Abstract of: WO2026046493A1
A method for a base station associated with one or more terminal devices, comprising: determining an initial state at a first time; determining a resource allocation for the one or more terminal devices associated with the base station based on a Deep Neural Network (DNN) policy; causing the resource allocation to be executed; determining a resulting state at a second time preceding the execution of the resource allocation; determining a local reward for the base station; and determining local importance values for a current transition and historical transitions. Further: receiving neighbouring importance values for a current transition and historical transitions of a neighbouring base station; determining global importance values based on the local importance values and the neighbouring importance values; determining which of the current transition and the historical transitions has the lowest global importance value and dropping it so as to store the remaining transitions; receiving neighbouring rewards for previous transitions; and training the DNN policy based on previous transitions and the received neighbouring rewards.
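The importance-based buffer management step can be sketched briefly. This is a hedged illustration only: the rule for combining local and neighbouring importance values (element-wise summation here) is an assumption, and the function name is invented.

```python
# Sketch of replay-buffer trimming driven by global importance values:
# local and neighbouring importance values are combined per transition,
# and the transition with the lowest global importance is dropped.
def drop_least_important(transitions, local_importance, neighbour_importance):
    """Keep all transitions except the one with the lowest global importance.

    The three lists are aligned and cover the current transition plus the
    stored historical transitions of the base station.
    """
    global_importance = [l + n for l, n in zip(local_importance, neighbour_importance)]
    drop_idx = min(range(len(global_importance)), key=global_importance.__getitem__)
    return [t for i, t in enumerate(transitions) if i != drop_idx]
```

The surviving transitions, together with the received neighbouring rewards, would then feed the DNN policy's training step.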
Abstract of: WO2026046490A1
A method for a base station associated with one or more terminal devices, comprising: determining an initial state at a first time; determining a resource allocation for the one or more terminal devices associated with the base station based on a Deep Neural Network (DNN) policy; causing the resource allocation to be executed; determining a resulting state at a second time preceding the execution of the resource allocation; determining a local reward for the base station; receiving a neighbouring reward from a neighbouring base station; determining a group reward based on the local reward and the received neighbouring reward; receiving a previous neighbouring hyper parameter; updating a local hyper parameter based on the previous neighbouring hyper parameter and the group reward, wherein updating the local hyper parameter utilizes a hyper DNN policy; and training the DNN policy based on transitions of the base station and the updated local hyper parameter.
Abstract of: US20260057486A1
The present disclosure provides an apparatus and method of guided neural network model for image processing. An apparatus may comprise a guidance map generator, a synthesis network and an accelerator. The guidance map generator may receive a first image as a content image and a second image as a style image, and generate a first plurality of guidance maps and a second plurality of guidance maps, respectively from the first image and the second image. The synthesis network may synthesize the first plurality of guidance maps and the second plurality of guidance maps to determine guidance information. The accelerator may generate an output image by applying the style of the second image to the first image based on the guidance information.
Abstract of: US20260057685A1
Methods, systems, and computer readable storage media for performing operations comprising: obtaining a plurality of initial network inputs that have been classified as belonging to a corresponding ground truth class; processing each of the plurality of initial network inputs using a trained target neural network to generate a respective predicted network output for each initial network input, the respective predicted network output comprising a respective score for each of a plurality of classes, the plurality of classes comprising the ground truth class; identifying, based on the respective predicted network outputs and the ground truth class, a subset of the initial network inputs as having been misclassified by the trained target neural network; and determining, based on the subset of initial network inputs, one or more failure case latent representations, wherein each failure case latent representation is a latent representation that characterizes network inputs that belong to the ground truth class but that are likely to be misclassified by the trained target neural network.
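The misclassification-mining step can be sketched as follows. This is a minimal illustration under stated assumptions: characterizing the failure cases by the centroid of their latent vectors is one plausible choice of "failure case latent representation", and all names are invented.

```python
# Sketch of deriving a failure-case latent representation: find inputs of a
# ground-truth class that the trained model misclassifies, then summarize
# them in latent space by averaging their latent vectors into a centroid.
def failure_case_representation(latents, predictions, ground_truth_class):
    """Return the centroid of latent vectors for misclassified inputs.

    latents: one latent vector per network input
    predictions: one per-class score list per network input
    """
    failures = [
        z for z, scores in zip(latents, predictions)
        # Misclassified: the top-scoring class is not the ground-truth class.
        if max(range(len(scores)), key=scores.__getitem__) != ground_truth_class
    ]
    if not failures:
        return None  # no failure cases for this class
    dim = len(failures[0])
    return [sum(z[i] for z in failures) / len(failures) for i in range(dim)]
```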
Abstract of: US20260057234A1
A method and a device for training a graph neural network are provided. The method may be performed by a graphics processing unit (GPU) and may include: determining at least one batch of training data; transmitting batch information corresponding to the determined at least one batch to at least one memory expansion device, so that the at least one memory expansion device acquires feature data for one or more data blocks of the at least one batch based on the batch information; receiving the feature data from the at least one memory expansion device; and training the graph neural network based on the feature data.
Abstract of: US20260056983A1
A method and system for providing an intelligent response agent based on a sophisticated reasoning and speculation function. The method and system can generate and provide response data for queries related to specialized documents using a deep-learning neural network that implements the sophisticated reasoning and speculation function as a stepwise process.
Abstract of: US20260050438A1
One embodiment provides a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands, including a multi-bit input value and a one-bit weight associated with a neural network, as well as an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier performs a fused operation, including an exclusive-NOR (XNOR) operation and a population count operation, to produce an intermediate product. The adder adds the intermediate product to the value stored in the accumulator register and updates the stored value.
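The XNOR/popcount trick is the standard dot-product shortcut for binarized neural networks, and its arithmetic can be demonstrated in software. The bit encoding (0 mapped to -1, 1 mapped to +1) and the accumulator model below are illustrative assumptions, not the instruction semantics claimed in the patent.

```python
# Illustrative model of a fused XNOR + population-count multiply:
# with operands encoded as {0 -> -1, 1 -> +1}, the signed dot product of
# two `width`-bit vectors equals 2 * popcount(XNOR(a, w)) - width.
def xnor_popcount_dot(input_bits, weight_bits, width):
    """Binary dot product: XNOR matching bit positions, count, rescale."""
    xnor = ~(input_bits ^ weight_bits) & ((1 << width) - 1)  # keep `width` bits
    matches = bin(xnor).count("1")                           # population count
    return 2 * matches - width

class Accumulator:
    """Accumulator register: adds each intermediate product to a running value."""
    def __init__(self):
        self.value = 0

    def fma(self, input_bits, weight_bits, width):
        self.value += xnor_popcount_dot(input_bits, weight_bits, width)
        return self.value
```

For example, `0b1011` and `0b1001` agree in three of four bit positions, giving a signed dot product of 2 * 3 - 4 = 2.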
Abstract of: US20260051148A1
A processor-implemented method for implementing graph cuts for explainability using an artificial neural network (ANN) includes receiving, via the ANN, an input. The input is represented as a graph. The graph includes nodes connected by edges. The ANN determines a graph cut between a source node and a sink node associated with the input by solving a quadratic process with equality constraints. The ANN processes a subset of the input based on the graph cut to generate a prediction.
Abstract of: US20260051318A1
A method includes receiving training utterances that include non-synthetic speech training utterances and synthetic speech training utterances. For each training utterance, the method includes: processing, using a memorized neural network, a corresponding sequence of input audio frames to generate a hotword detection output indicating a likelihood that the training utterance includes a hotword; determining a first loss based on the hotword detection output; obtaining a hidden layer feature vector for each corresponding input audio frame; processing, using a speech classification model, the hidden layer feature vectors to predict a classification output for the training utterance; and determining an adversarial loss based on the classification output predicted for the training utterance. The method also includes training the memorized neural network on the first losses and the adversarial losses to teach the memorized neural network to detect the hotword in audio and to prevent overfitting to the synthetic speech training utterances.
Abstract of: US2024428056A1
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing tasks. One of the methods includes obtaining a sequence of input tokens, where each token is selected from a vocabulary of tokens that includes text tokens and audio tokens, and wherein the sequence of input tokens includes tokens that describe a task to be performed and data for performing the task; generating a sequence of embeddings by embedding each token in the sequence of input tokens in an embedding space; and processing the sequence of embeddings using a language model neural network to generate a sequence of output tokens for the task, where each token is selected from the vocabulary.
Abstract of: WO2026034826A1
A method includes receiving a user utterance. The method also includes providing the user utterance to a first convolutional recurrent neural network (RNN) classifier and a second convolutional RNN classifier to process the user utterance and provide outputs to a first joint layer. The method also includes providing the user utterance to an automated speech recognition (ASR) model to process the user utterance and provide a text transcript to a text classifier. The method also includes combining the outputs from the first convolutional RNN classifier and the second convolutional RNN classifier using the first joint layer. The method also includes combining outputs from the first joint layer and the text classifier using a second joint layer. The method also includes determining an audio class based on a result from the second joint layer, wherein the audio class indicates whether the user utterance includes speech intended for further processing.
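The two-stage fusion described above can be sketched compactly. This is a toy illustration only: element-wise averaging stands in for the joint layers, whose actual form the abstract leaves open, and the function names are invented.

```python
# Sketch of the two-stage joint-layer combination: outputs of two
# convolutional RNN classifiers are merged in a first joint layer, and the
# result is merged with the text classifier's output in a second joint layer.
def joint_layer(*outputs):
    """Fuse aligned per-class score lists by element-wise averaging."""
    n = len(outputs)
    return [sum(scores) / n for scores in zip(*outputs)]

def classify_utterance(rnn1_scores, rnn2_scores, text_scores):
    """Return the audio class index produced by the second joint layer."""
    first = joint_layer(rnn1_scores, rnn2_scores)
    second = joint_layer(first, text_scores)
    return max(range(len(second)), key=second.__getitem__)
```

With class 0 as "not intended for further processing" and class 1 as "intended speech", the final argmax plays the role of the audio-class decision.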
Abstract of: JP2025166199A
To provide a decoder, an encoder, a neural network controller, and a method allowing for efficient representation and transmission of neural network parameters of a learning or update process.
SOLUTION: A decoder for decoding parameters of a neural network obtains a plurality of neural network parameters of the neural network on the basis of an encoded bitstream (1010) and obtains, from the encoded bitstream, node information describing a node of a parameter update tree (1020). The node information includes a parent node identifier and parameter update information. The decoder also derives one or more neural network parameters using parameter information of a parent node identified by the parent node identifier and using the parameter update information (1030).
SELECTED DRAWING: Figure 10
Abstract of: US20260045258A1
A method includes receiving a user utterance. The method also includes providing the user utterance to a first convolutional recurrent neural network (RNN) classifier and a second convolutional RNN classifier to process the user utterance and provide outputs to a first joint layer. The method also includes providing the user utterance to an automated speech recognition (ASR) model to process the user utterance and provide a text transcript to a text classifier. The method also includes combining the outputs from the first convolutional RNN classifier and the second convolutional RNN classifier using the first joint layer. The method also includes combining outputs from the first joint layer and the text classifier using a second joint layer. The method also includes determining an audio class based on a result from the second joint layer, wherein the audio class indicates whether the user utterance includes speech intended for further processing.
Publication no.: EP4689930A1 11/02/2026
Applicant:
MICROSOFT TECHNOLOGY LICENSING LLC [US]
Abstract of: US2024330679A1
A method for making predictions pertaining to entities represented within a heterogeneous graph includes: identifying, for each node in the heterogeneous graph structure, a set of node-target paths that connect the node to a target node; assigning, to each of the node-target paths identified for each node, a path type identifier indicative of the number of edges and corresponding edge types in the associated node-target path; and extracting a semantic tree from the heterogeneous graph structure. The semantic tree includes the target node as a root node and defines a hierarchy of metapaths that each individually correspond to a subset of the node-target paths in the heterogeneous graph structure assigned to a same path type identifier. The semantic tree is encoded, using one or more neural networks, by generating a metapath embedding corresponding to each metapath in the semantic tree. Each of the resulting metapath embeddings encodes aggregated feature-label data for nodes in the heterogeneous graph structure corresponding to the path type identifier of the metapath associated with that embedding. A label is predicted for the target node in the heterogeneous graph structure based on the set of metapath embeddings.