Computer Science
http://hdl.handle.net/10012/9930
2023-03-21T09:27:43ZA Generalized Blending Scheme for Arbitrary Order of Continuity
http://hdl.handle.net/10012/19215
A Generalized Blending Scheme for Arbitrary Order of Continuity
Fang, Xiang
In this thesis, new templates and formulas of blending functions, schemes, and algorithms are derived for solving the scattered data interpolation problem. The resulting data fitting scheme interpolates the positions and derivatives of a triangular mesh, and for each triangle of the mesh blends three triangular sub-surfaces, and creates a triangular patch. Similar to some existing schemes, the resulting surface inherits the derivatives of the sub-surfaces on the boundaries. In contrast with existing schemes, the new scheme has additional properties: The order of interpolated derivatives is extended to arbitrary values, and the restrictions of the sub-surfaces are relaxed. Then based on the properties of the new blending functions, an algorithm for constructing smooth triangular surfaces with global geometric continuity is described. The new blending functions and the scheme are then extended to multi-sided faces. The algorithm using these new blending functions accepts data sites formed by multi-sided polygons.
2023-03-20T00:00:00ZSystems and Algorithms for Dynamic Graph Processing
http://hdl.handle.net/10012/19195
Systems and Algorithms for Dynamic Graph Processing
Ammar, Khaled
Data generated from human and systems interactions could be naturally represented as graph data. Several emerging applications rely on graph data, such as the semantic web, social networks, bioinformatics, finance, and trading among others. These applications require graph querying capabilities which are often implemented in graph database management systems (GDBMS). Many GDBMSs have capabilities to evaluate one-time versions of recursive or subgraph queries over static graphs – graphs that do not change or a single snapshot of a changing graph. They generally do not support incrementally maintaining queries as graphs change. However, most applications that employ graphs are dynamic in nature resulting in graphs that change over time, also known as dynamic graphs.
This thesis investigates how to build a generic and scalable incremental computation solution that is oblivious to graph workloads. It focuses on two fundamental computations performed by many applications: recursive queries and subgraph queries. Specifically, for
subgraph queries, this thesis presents the first approach that (i) performs joins with worstcase optimal computation and communication costs; and (ii) maintains a total memory footprint almost linear in the number of input edges. For recursive queries, this thesis studies optimizations for using differential computation (DC). DC is a general incremental computation that can maintain the output of a recursive dataflow computation upon changes. However, it requires a prohibitively large amount of memory because it maintains differences that track changes in queries input/output. The thesis proposes a suite of optimizations that are based on reducing the number of these differences and recomputing them when necessary. The techniques and optimizations in this thesis, for subgraph and recursive computations, represent a proposal for how to build a state-of-the-art generic and
scalable GDBMS for dynamic graph data management.
2023-03-10T00:00:00ZAlgorithms for Geometric Facility Location: Centers in a Polygon and Dispersion on a Line
http://hdl.handle.net/10012/19182
Algorithms for Geometric Facility Location: Centers in a Polygon and Dispersion on a Line
Naredla, Anurag Murty
We study three geometric facility location problems in this thesis.
First, we consider the dispersion problem in one dimension. We are given an ordered list
of (possibly overlapping) intervals on a line. We wish to choose exactly one point from
each interval such that their left to right ordering on the line matches the input order.
The aim is to choose the points so that the distance between the closest pair of points is
maximized, i.e., they must be socially distanced while respecting the order. We give a new
linear-time algorithm for this problem that produces a lexicographically optimal solution.
We also consider some generalizations of this problem.
For the next two problems, the domain of interest is a simple polygon with n vertices.
The second problem concerns the visibility center. The convention is to think of a polygon
as the top view of a building (or art gallery) where the polygon boundary represents opaque
walls. Two points in the domain are visible to each other if the line segment joining them
does not intersect the polygon exterior. The distance to visibility from a source point to a
target point is the minimum geodesic distance from the source to a point in the polygon
visible to the target. The question is: Where should a single guard be located within the
polygon to minimize the maximum distance to visibility? For m point sites in the polygon,
we give an O((m + n) log (m + n)) time algorithm to determine their visibility center.
Finally, we address the problem of locating the geodesic edge center of a simple polygon—a
point in the polygon that minimizes the maximum geodesic distance to any edge. For a
triangle, this point coincides with its incenter. The geodesic edge center is a generalization
of the well-studied geodesic center (a point that minimizes the maximum distance to any
vertex). Center problems are closely related to farthest Voronoi diagrams, which are well-
studied for point sites in the plane, and less well-studied for line segment sites in the plane.
When the domain is a polygon rather than the whole plane, only the case of point sites has
been addressed—surprisingly, more general sites (with line segments being the simplest
example) have been largely ignored. En route to our solution, we revisit, correct, and
generalize (sometimes in a non-trivial manner) existing algorithms and structures tailored
to work specifically for point sites. We give an optimal linear-time algorithm for finding
the geodesic edge center of a simple polygon.
2023-02-24T00:00:00ZNovel Methods for Natural Language Modeling and Pretraining
http://hdl.handle.net/10012/19175
Novel Methods for Natural Language Modeling and Pretraining
Bai, He
This thesis is about modeling language sequences to achieve lower perplexity, better generation, and benefit downstream language tasks; specifically, this thesis addresses the importance of natural language features including the segmentation feature, lexical feature, and alignment feature. We present three new techniques that improve language sequence modeling with different language features.
1. Segment-Aware Language Modeling is a novel model architecture leveraging the text segementation feature for text sequence modeling. It encodes richer positional information for language modeling, by replacing the original token position encoding with a combined position encoding of paragraph, sentence, and token. By applying our approach to Transformer-XL, we train a new language model, Segatron-XL, that achieves a 6.6-7.8% relative reduction in perplexity. Additionally, BERT pretrained with our method -- SegaBERT -- outperforms BERT on general language understanding, sentence representation learning, and machine reading comprehension tasks. Furthermore, our SegaBERT-large model outperforms RoBERTa-large on zero-shot STS tasks. These experimental results demonstrate that our proposed Segatron works on both language models with relative position embeddings and pretrained language models with absolute position embeddings.
2. Hypernym-Instructed Language Modeling is a novel training method leveraging the lexical feature for rare word modeling. It maps words that have a common WordNet hypernym to the same class and trains large neural LMs by gradually annealing from predicting the class to token prediction during training. Class-based prediction leads to an implicit context aggregation for similar words and thus can improve generalization for rare words. Empirically, this curriculum learning strategy consistently reduces perplexity over various large, highly-performant state-of-the-art Transformer-based models on two datasets, WikiText-103 and ArXiv. Our analysis shows that the performance improvement is achieved without sacrificing performance on rare words.
3. Alignment-Aware Acoustic and Text Modeling is a novel pretraining method leveraging both the segmentation and alignment features for text-speech sequence modeling. It reconstructs masked acoustic signals with text input and
acoustic-text alignment during training. In this way, the pretrained model can generate high quality of reconstructed spectrogram, which can be applied to the speech editing and new speaker TTS directly. Experiments show A3T outperforms SOTA models on speech editing and improves multi-speaker speech synthesis without the external speaker verification model.
2023-02-21T00:00:00Z